drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
The commit ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
calculated the last index of L1 stream table by 1 << smmu->sid_bits. 1
is 32 bit value.
However AmpereOne has 32-bit stream id size. This resulted in
ouf-of-bound shift. The disassembly of shift is:
ldr w2, [x19, 828] //, smmu_7(D)->sid_bits
mov w20, 1
lsl w20, w20, w2
According to ARM spec, if the registers are 32 bit, the instruction actually
does:
dest = src << (shift % 32)
So it actually shifted by zero bit.
This caused v6.12-rc1 failed to boot on AmpereOne and UBSAN also
reported:
UBSAN: shift-out-of-bounds in drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:3628:29
shift exponent 32 is too large for 32-bit type 'int'
CPU: 70 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc1 #4
Hardware name: ZOLLNER SUNMOONLAKE/SunMoon Lake, BIOS 00.00. 2024-08-28 18:42:45 08/28/2024
Call trace:
dump_backtrace+0xdc/0x140
show_stack+0x20/0x40
dump_stack_lvl+0x60/0x80
dump_stack+0x18/0x28
ubsan_epilogue+0x10/0x48
__ubsan_handle_shift_out_of_bounds+0xd8/0x1a0
arm_smmu_init_structures+0x374/0x3c8
arm_smmu_device_probe+0x208/0x600
platform_probe+0x70/0xe8
really_probe+0xc8/0x3a0
__driver_probe_device+0x84/0x160
driver_probe_device+0x44/0x130
__driver_attach+0xcc/0x208
bus_for_each_dev+0x84/0x100
driver_attach+0x2c/0x40
bus_add_driver+0x158/0x290
driver_register+0x70/0x138
__platform_driver_register+0x2c/0x40
arm_smmu_driver_init+0x28/0x40
do_one_initcall+0x60/0x318
do_initcalls+0x198/0x1e0
kernel_init_freeable+0x18c/0x1e8
kernel_init+0x28/0x160
ret_from_fork+0x10/0x20
Using 64 bit immediate when doing shift can solve the problem. The
disssembly after the fix looks like:
ldr w20, [x19, 828] //, smmu_7(D)->sid_bits
mov x0, 1
lsl x0, x0, x20
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 737c5b882355..01a2faee04bc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3625,7 +3625,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
u32 l1size;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
unsigned int last_sid_idx =
- arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
+ arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1);
/* Calculate the L1 size, capped to the SIDSIZE. */
cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
--
2.41.0
Hello!
On 01/10/2024 19:03, Yang Shi wrote:
> The commit ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
> calculated the last index of L1 stream table by 1 << smmu->sid_bits. 1
> is 32 bit value.
> However AmpereOne has 32-bit stream id size. This resulted in
> ouf-of-bound shift. The disassembly of shift is:
>
> ldr w2, [x19, 828] //, smmu_7(D)->sid_bits
> mov w20, 1
> lsl w20, w20, w2
>
> According to ARM spec, if the registers are 32 bit, the instruction actually
> does:
> dest = src << (shift % 32)
>
> So it actually shifted by zero bit.
>
> This caused v6.12-rc1 failed to boot on AmpereOne
Same here - one of arm's reference designs prints 1 giga-tonne of:
| arm-smmu-v3 arm-smmu-v3.5.auto: event 0x02 received:
| arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000c90000000002
| arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000
| arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000
| arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000
during boot - then fails to find the nvme.
Bisect points to ce410410f1a7, and the below diff fixes it.
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 737c5b882355..01a2faee04bc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3625,7 +3625,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
> u32 l1size;
> struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> unsigned int last_sid_idx =
> - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
> + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1);
>
> /* Calculate the L1 size, capped to the SIDSIZE. */
> cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
Tested-by: James Morse <james.morse@arm.com>
Thanks,
James
On 10/2/24 9:58 AM, James Morse wrote:
> Hello!
>
> On 01/10/2024 19:03, Yang Shi wrote:
>> The commit ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
>> calculated the last index of L1 stream table by 1 << smmu->sid_bits. 1
>> is 32 bit value.
>> However AmpereOne has 32-bit stream id size. This resulted in
>> ouf-of-bound shift. The disassembly of shift is:
>>
>> ldr w2, [x19, 828] //, smmu_7(D)->sid_bits
>> mov w20, 1
>> lsl w20, w20, w2
>>
>> According to ARM spec, if the registers are 32 bit, the instruction actually
>> does:
>> dest = src << (shift % 32)
>>
>> So it actually shifted by zero bit.
>>
>> This caused v6.12-rc1 failed to boot on AmpereOne
> Same here - one of arm's reference designs prints 1 giga-tonne of:
> | arm-smmu-v3 arm-smmu-v3.5.auto: event 0x02 received:
> | arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000c90000000002
> | arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000
> | arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000
> | arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000
>
> during boot - then fails to find the nvme.
> Bisect points to ce410410f1a7, and the below diff fixes it.
>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 737c5b882355..01a2faee04bc 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -3625,7 +3625,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
>> u32 l1size;
>> struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>> unsigned int last_sid_idx =
>> - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
>> + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1);
>>
>> /* Calculate the L1 size, capped to the SIDSIZE. */
>> cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
>
> Tested-by: James Morse <james.morse@arm.com>
Thank you for testing. I will change the patch subject a little bit to
make it more general.
>
>
> Thanks,
>
> James
On Tue, Oct 01, 2024 at 11:03:46AM -0700, Yang Shi wrote: > Using 64 bit immediate when doing shift can solve the problem. The > disssembly after the fix looks like: [...] > unsigned int last_sid_idx = > - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); > + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1); Could a 32-bit build be a corner case where UL is no longer a "64 bit" stated in the commit message? Also, there are other places doing "1 << smmu->sid_bits", e.g. arm_smmu_init_strtab_linear(). Then, can ssid_bits/s1cdmax be a concern similarly? Thanks Nicolin
On 10/1/24 11:27 AM, Nicolin Chen wrote: > On Tue, Oct 01, 2024 at 11:03:46AM -0700, Yang Shi wrote: >> Using 64 bit immediate when doing shift can solve the problem. The >> disssembly after the fix looks like: > [...] > >> unsigned int last_sid_idx = >> - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); >> + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1); > Could a 32-bit build be a corner case where UL is no longer a > "64 bit" stated in the commit message? It shouldn't. Because smmu v3 depends on ARM64. config ARM_SMMU_V3 tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support" depends on ARM64 > > Also, there are other places doing "1 << smmu->sid_bits", e.g. > arm_smmu_init_strtab_linear(). The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL should be used if we want to take extra caution and don't prefer rely on compiler. > > Then, can ssid_bits/s1cdmax be a concern similarly? IIUC, ssid_bits is determined by IDR1_SSIDSIZE. It is GENMASK(10, 6). So it shouldn't be 32. IDR1_SIDSIZE is GENMASK(5, 0). > > Thanks > Nicolin
On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: > On 10/1/24 11:27 AM, Nicolin Chen wrote: > > On Tue, Oct 01, 2024 at 11:03:46AM -0700, Yang Shi wrote: > > > Using 64 bit immediate when doing shift can solve the problem. The > > > disssembly after the fix looks like: > > [...] > > > > > unsigned int last_sid_idx = > > > - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); > > > + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1); > > Could a 32-bit build be a corner case where UL is no longer a > > "64 bit" stated in the commit message? > > It shouldn't. Because smmu v3 depends on ARM64. > > config ARM_SMMU_V3 > tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support" > depends on ARM64 ARM64 can have aarch32 support. I am not sure if ARM64 running a 32-bit OS can be a case though, (and not confined to AmpereOne). > > Then, can ssid_bits/s1cdmax be a concern similarly? > > IIUC, ssid_bits is determined by IDR1_SSIDSIZE. It is GENMASK(10, 6). So > it shouldn't be 32. IDR1_SIDSIZE is GENMASK(5, 0). Rechecked the RM. Yea, max sid can be 32 but max ssid is 20 at this moment, so we should be safe. Thanks Nicolin
On 10/1/24 12:29 PM, Nicolin Chen wrote: > On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: >> On 10/1/24 11:27 AM, Nicolin Chen wrote: >>> On Tue, Oct 01, 2024 at 11:03:46AM -0700, Yang Shi wrote: >>>> Using 64 bit immediate when doing shift can solve the problem. The >>>> disssembly after the fix looks like: >>> [...] >>> >>>> unsigned int last_sid_idx = >>>> - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); >>>> + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1); >>> Could a 32-bit build be a corner case where UL is no longer a >>> "64 bit" stated in the commit message? >> It shouldn't. Because smmu v3 depends on ARM64. >> >> config ARM_SMMU_V3 >> tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support" >> depends on ARM64 > ARM64 can have aarch32 support. I am not sure if ARM64 running a > 32-bit OS can be a case though, (and not confined to AmpereOne). I don't think ARM64 runs 32-bit kernel, at least for newer kernel. > >>> Then, can ssid_bits/s1cdmax be a concern similarly? >> IIUC, ssid_bits is determined by IDR1_SSIDSIZE. It is GENMASK(10, 6). So >> it shouldn't be 32. IDR1_SIDSIZE is GENMASK(5, 0). > Rechecked the RM. Yea, max sid can be 32 but max ssid is 20 at > this moment, so we should be safe. > > Thanks > Nicolin
On 2024-10-01 8:48 pm, Yang Shi wrote: > > > On 10/1/24 12:29 PM, Nicolin Chen wrote: >> On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: >>> On 10/1/24 11:27 AM, Nicolin Chen wrote: >>>> On Tue, Oct 01, 2024 at 11:03:46AM -0700, Yang Shi wrote: >>>>> Using 64 bit immediate when doing shift can solve the problem. The >>>>> disssembly after the fix looks like: >>>> [...] >>>> >>>>> unsigned int last_sid_idx = >>>>> - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); >>>>> + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1); >>>> Could a 32-bit build be a corner case where UL is no longer a >>>> "64 bit" stated in the commit message? >>> It shouldn't. Because smmu v3 depends on ARM64. >>> >>> config ARM_SMMU_V3 >>> tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support" >>> depends on ARM64 >> ARM64 can have aarch32 support. I am not sure if ARM64 running a >> 32-bit OS can be a case though, (and not confined to AmpereOne). > > I don't think ARM64 runs 32-bit kernel, at least for newer kernel. Just use ULL - if the point is that it must be a 64-bit shift for correctness, then being clear about that intent is far more valuable than saving one character of source code. Thanks, Robin. > >> >>>> Then, can ssid_bits/s1cdmax be a concern similarly? >>> IIUC, ssid_bits is determined by IDR1_SSIDSIZE. It is GENMASK(10, 6). So >>> it shouldn't be 32. IDR1_SIDSIZE is GENMASK(5, 0). >> Rechecked the RM. Yea, max sid can be 32 but max ssid is 20 at >> this moment, so we should be safe. >> >> Thanks >> Nicolin >
On 10/2/24 2:59 AM, Robin Murphy wrote: > On 2024-10-01 8:48 pm, Yang Shi wrote: >> >> >> On 10/1/24 12:29 PM, Nicolin Chen wrote: >>> On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: >>>> On 10/1/24 11:27 AM, Nicolin Chen wrote: >>>>> On Tue, Oct 01, 2024 at 11:03:46AM -0700, Yang Shi wrote: >>>>>> Using 64 bit immediate when doing shift can solve the problem. The >>>>>> disssembly after the fix looks like: >>>>> [...] >>>>> >>>>>> unsigned int last_sid_idx = >>>>>> - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); >>>>>> + arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1); >>>>> Could a 32-bit build be a corner case where UL is no longer a >>>>> "64 bit" stated in the commit message? >>>> It shouldn't. Because smmu v3 depends on ARM64. >>>> >>>> config ARM_SMMU_V3 >>>> tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support" >>>> depends on ARM64 >>> ARM64 can have aarch32 support. I am not sure if ARM64 running a >>> 32-bit OS can be a case though, (and not confined to AmpereOne). >> >> I don't think ARM64 runs 32-bit kernel, at least for newer kernel. > > Just use ULL - if the point is that it must be a 64-bit shift for > correctness, then being clear about that intent is far more valuable > than saving one character of source code. Yeah, it must be 64 bit. Will fix in v2. > > Thanks, > Robin. > >> >>> >>>>> Then, can ssid_bits/s1cdmax be a concern similarly? >>>> IIUC, ssid_bits is determined by IDR1_SSIDSIZE. It is GENMASK(10, >>>> 6). So >>>> it shouldn't be 32. IDR1_SIDSIZE is GENMASK(5, 0). >>> Rechecked the RM. Yea, max sid can be 32 but max ssid is 20 at >>> this moment, so we should be safe. >>> >>> Thanks >>> Nicolin >>
On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: > > Also, there are other places doing "1 << smmu->sid_bits", e.g. > > arm_smmu_init_strtab_linear(). > > The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL > should be used if we want to take extra caution and don't prefer rely on > compiler. Still, we should be fixing them all if sid_bits == 32, all those shifts should be throwing a UBSAN error. It would be crazy to have a 32 bit linear table but I guess it is allowed? Jason
On 10/1/24 12:18 PM, Jason Gunthorpe wrote: > On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: >>> Also, there are other places doing "1 << smmu->sid_bits", e.g. >>> arm_smmu_init_strtab_linear(). >> The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL >> should be used if we want to take extra caution and don't prefer rely on >> compiler. > Still, we should be fixing them all if sid_bits == 32, all those > shifts should be throwing a UBSAN error. It would be crazy to have a OK, will cover this is v2. > 32 bit linear table but I guess it is allowed? I'm not smmu expert, but if sid_bits is 32, it looks like the linear table will consume 256GB IIUC? That is crazy. > > Jason
On Tue, Oct 01, 2024 at 12:38:56PM -0700, Yang Shi wrote: > > > On 10/1/24 12:18 PM, Jason Gunthorpe wrote: > > On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: > > > > Also, there are other places doing "1 << smmu->sid_bits", e.g. > > > > arm_smmu_init_strtab_linear(). > > > The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL > > > should be used if we want to take extra caution and don't prefer rely on > > > compiler. > > Still, we should be fixing them all if sid_bits == 32, all those > > shifts should be throwing a UBSAN error. It would be crazy to have a > > OK, will cover this is v2. Maybe just make a little inline function to do this math and remove the repated open coding? Then the types can be right, etc. Jason
On 10/1/24 12:46 PM, Jason Gunthorpe wrote:
> On Tue, Oct 01, 2024 at 12:38:56PM -0700, Yang Shi wrote:
>>
>> On 10/1/24 12:18 PM, Jason Gunthorpe wrote:
>>> On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote:
>>>>> Also, there are other places doing "1 << smmu->sid_bits", e.g.
>>>>> arm_smmu_init_strtab_linear().
>>>> The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL
>>>> should be used if we want to take extra caution and don't prefer rely on
>>>> compiler.
>>> Still, we should be fixing them all if sid_bits == 32, all those
>>> shifts should be throwing a UBSAN error. It would be crazy to have a
>> OK, will cover this is v2.
> Maybe just make a little inline function to do this math and remove
> the repated open coding? Then the types can be right, etc.
Something like this?
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 01a2faee04bc..0f3aa7962117 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3624,8 +3624,9 @@ static int arm_smmu_init_strtab_2lvl(struct
arm_smmu_device *smmu)
{
u32 l1size;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ unsigned int max_sid = arm_smmu_strtab_max_sid(smmu->sid_bits);
unsigned int last_sid_idx =
- arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1);
+ arm_smmu_strtab_l1_idx(max_sid - 1);
/* Calculate the L1 size, capped to the SIDSIZE. */
cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
@@ -3657,8 +3658,9 @@ static int arm_smmu_init_strtab_linear(struct
arm_smmu_device *smmu)
{
u32 size;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ unsigned int max_sid = arm_smmu_strtab_max_sid(smmu->sid_bits);
- size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste);
+ size = max_sid * sizeof(struct arm_smmu_ste);
cfg->linear.table = dmam_alloc_coherent(smmu->dev, size,
&cfg->linear.ste_dma,
GFP_KERNEL);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1e9952ca989f..de9f14293485 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -235,6 +235,11 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
return sid % STRTAB_NUM_L2_STES;
}
+static inline unsigned int arm_smmu_strtab_max_sid(unsigned int sid_bits)
+{
+ return (1UL << sid_bits);
+}
+
>
> Jason
Hello,
On 01/10/2024 21:47, Yang Shi wrote:
> On 10/1/24 12:46 PM, Jason Gunthorpe wrote:
>> On Tue, Oct 01, 2024 at 12:38:56PM -0700, Yang Shi wrote:
>>>
>>> On 10/1/24 12:18 PM, Jason Gunthorpe wrote:
>>>> On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote:
>>>>>> Also, there are other places doing "1 << smmu->sid_bits", e.g.
>>>>>> arm_smmu_init_strtab_linear().
>>>>> The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL
>>>>> should be used if we want to take extra caution and don't prefer rely on
>>>>> compiler.
>>>> Still, we should be fixing them all if sid_bits == 32, all those
>>>> shifts should be throwing a UBSAN error. It would be crazy to have a
>>> OK, will cover this is v2.
>> Maybe just make a little inline function to do this math and remove
>> the repated open coding? Then the types can be right, etc.
>
> Something like this?
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 01a2faee04bc..0f3aa7962117 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3624,8 +3624,9 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
> {
> u32 l1size;
> struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> + unsigned int max_sid = arm_smmu_strtab_max_sid(smmu->sid_bits);
> unsigned int last_sid_idx =
> - arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1);
> + arm_smmu_strtab_l1_idx(max_sid - 1);
>
> /* Calculate the L1 size, capped to the SIDSIZE. */
> cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
> @@ -3657,8 +3658,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
> {
> u32 size;
> struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> + unsigned int max_sid = arm_smmu_strtab_max_sid(smmu->sid_bits);
>
> - size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste);
> + size = max_sid * sizeof(struct arm_smmu_ste);
> cfg->linear.table = dmam_alloc_coherent(smmu->dev, size,
> &cfg->linear.ste_dma,
> GFP_KERNEL);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 1e9952ca989f..de9f14293485 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -235,6 +235,11 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
> return sid % STRTAB_NUM_L2_STES;
> }
>
> +static inline unsigned int arm_smmu_strtab_max_sid(unsigned int sid_bits)
> +{
> + return (1UL << sid_bits);
> +}
> +
On the same platform - this also fixes the issue:
Tested-by: James Morse <james.morse@arm.com>
Thanks,
James
On Tue, Oct 01, 2024 at 01:47:24PM -0700, Yang Shi wrote:
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 1e9952ca989f..de9f14293485 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -235,6 +235,11 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
> return sid % STRTAB_NUM_L2_STES;
> }
>
> +static inline unsigned int arm_smmu_strtab_max_sid(unsigned int sid_bits)
> +{
Can we take the smmu struct here instead of the int?
But yes, this looks OK
Jason
On 10/1/24 2:02 PM, Jason Gunthorpe wrote:
> On Tue, Oct 01, 2024 at 01:47:24PM -0700, Yang Shi wrote:
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> index 1e9952ca989f..de9f14293485 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> @@ -235,6 +235,11 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
>> return sid % STRTAB_NUM_L2_STES;
>> }
>>
>> +static inline unsigned int arm_smmu_strtab_max_sid(unsigned int sid_bits)
>> +{
> Can we take the smmu struct here instead of the int?
No problem.
>
> But yes, this looks OK
Thank you.
>
> Jason
On 10/1/24 12:46 PM, Jason Gunthorpe wrote: > On Tue, Oct 01, 2024 at 12:38:56PM -0700, Yang Shi wrote: >> >> On 10/1/24 12:18 PM, Jason Gunthorpe wrote: >>> On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote: >>>>> Also, there are other places doing "1 << smmu->sid_bits", e.g. >>>>> arm_smmu_init_strtab_linear(). >>>> The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL >>>> should be used if we want to take extra caution and don't prefer rely on >>>> compiler. >>> Still, we should be fixing them all if sid_bits == 32, all those >>> shifts should be throwing a UBSAN error. It would be crazy to have a >> OK, will cover this is v2. > Maybe just make a little inline function to do this math and remove > the repated open coding? Then the types can be right, etc. Fine to me. > > Jason
© 2016 - 2026 Red Hat, Inc.