[PATCH v10 0/2] Dynamic Allocation of the reserved_mem array

Oreoluwa Babatunde posted 2 patches 1 month, 2 weeks ago
Posted by Oreoluwa Babatunde 1 month, 2 weeks ago
The reserved_mem array is used to store data for the different
reserved memory regions defined in the DT of a device. For each
region, the array stores information such as the region name, node
reference, start address, and size.

The array is currently statically allocated with a size of
MAX_RESERVED_REGIONS (64). This means that any system that specifies
more than MAX_RESERVED_REGIONS reserved memory regions will not have
enough space to store the information for all the regions.

This can be fixed by making the reserved_mem array a dynamically sized
array which is allocated using memblock_alloc() based on the exact
number of reserved memory regions defined in the DT.
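
A minimal userspace sketch of this idea, not the code from the series:
count the regions first, then allocate exactly that many entries. Here
calloc() stands in for memblock_alloc(), and struct rmem_entry,
count_regions() and alloc_rmem_array() are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct reserved_mem. */
struct rmem_entry {
	const char *name;
	unsigned long long base;
	unsigned long long size;
};

/* Hypothetical DT view: a NULL-terminated list of region names. */
static size_t count_regions(const char *const *names)
{
	size_t n = 0;

	assert(names != NULL);
	while (names[n])
		n++;
	return n;
}

/*
 * Allocate one zeroed entry per region. calloc() stands in for
 * memblock_alloc(). Returns NULL on failure or if there are no regions.
 */
static struct rmem_entry *alloc_rmem_array(const char *const *names,
					   size_t *count)
{
	assert(count != NULL);
	*count = count_regions(names);
	if (*count == 0)
		return NULL;
	return calloc(*count, sizeof(struct rmem_entry));
}
```

The array size follows the number of regions found, so no compile-time
maximum is needed for the final array in this sketch.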

On architectures such as arm64, memblock-allocated memory is not
writable until after the page tables have been set up.
This is an issue because the current implementation initializes the
reserved memory regions and stores their information in the array before
the page tables are set up. Hence, dynamically allocating the
reserved_mem array and attempting to write information to it at this
point will fail.

Therefore, the allocation of the reserved_mem array will need to be done
after the page tables have been set up, which means that the reserved
memory regions will also need to wait until after the page tables have
been set up to be stored in the array.

When processing the reserved memory regions defined in the DT, these
regions are marked as reserved by calling memblock_reserve(base, size),
where:  base = base address of the reserved region.
        size = the size of the reserved memory region.

If the region is defined with the "no-map" property,
memblock_mark_nomap(base, size) is also called.
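
The step described above can be sketched in userspace as follows. This
is only an illustration of the description in this cover letter; the
actual kernel helper may differ in detail. The stub_*() functions are
hypothetical stand-ins for memblock_reserve() and memblock_mark_nomap(),
and struct early_region and early_reserve_region() are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long phys_addr_t;

/* Hypothetical record of what the early code asked memblock to do. */
struct early_region {
	phys_addr_t base;
	phys_addr_t size;
	bool reserved;
	bool nomap;
};

/* Stub standing in for memblock_reserve(base, size). */
static int stub_memblock_reserve(struct early_region *r)
{
	r->reserved = true;
	return 0;
}

/* Stub standing in for memblock_mark_nomap(base, size). */
static int stub_memblock_mark_nomap(struct early_region *r)
{
	r->nomap = true;
	return 0;
}

/*
 * Reserve the region, and additionally mark it no-map when the DT node
 * carries the "no-map" property. Must run before page table setup.
 */
static int early_reserve_region(struct early_region *r, bool has_no_map)
{
	int ret;

	assert(r != NULL);
	ret = stub_memblock_reserve(r);
	if (ret)
		return ret;
	if (has_no_map)
		ret = stub_memblock_mark_nomap(r);
	return ret;
}
```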

The "no-map" property is used to indicate to the operating system that a
mapping of the specified region must NOT be created. This also means
that no access (including speculative accesses) is allowed on this
region of memory except when it is coming from the device driver that
this region of memory is being reserved for.[1]

Therefore, it is important to call memblock_reserve() and
memblock_mark_nomap() on all the reserved memory regions before the
system sets up the page tables so that the system does not unknowingly
include any of the no-map reserved memory regions in the memory map.

There are two ways to define how/where a reserved memory region is
placed in memory:
i) Statically-placed reserved memory regions,
i.e. regions defined with a set start address and size using the
     "reg" property in the DT.
ii) Dynamically-placed reserved memory regions,
i.e. regions defined by specifying a range of addresses where they can
     be placed in memory using the "alloc-ranges" and "size" properties
     in the DT.
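
As a rough illustration of the two kinds of regions, the sketch below
classifies a node by the properties it carries. struct rmem_node_props,
enum rmem_placement and classify_region() are hypothetical names, and
the flags are assumed to have been filled in by some earlier DT parsing
step that is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rmem_placement {
	RMEM_STATIC,	/* "reg" gives a fixed base and size */
	RMEM_DYNAMIC,	/* "size" given, base chosen at boot */
	RMEM_INVALID,	/* neither "reg" nor "size" */
};

/* Hypothetical summary of which properties a region node carries. */
struct rmem_node_props {
	bool has_reg;
	bool has_size;
	bool has_alloc_ranges;	/* only narrows where a dynamic region may go */
};

static enum rmem_placement classify_region(const struct rmem_node_props *p)
{
	assert(p != NULL);
	if (p->has_reg)
		return RMEM_STATIC;
	if (p->has_size)
		return RMEM_DYNAMIC;
	return RMEM_INVALID;
}
```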

The dynamically-placed reserved memory regions get assigned a start
address only at runtime. This needs to be done before the page
tables are set up so that memblock_reserve() and memblock_mark_nomap()
can be called on the allocated region as explained above.
Since the dynamically allocated reserved_mem array only becomes
available after the page tables have been set up, the information for
the dynamically-placed reserved memory regions needs to be stored
somewhere temporarily until the reserved_mem array is available.

Therefore, this series makes use of a temporary static array to store
the information of the dynamically-placed reserved memory regions until
the reserved_mem array is allocated.
Once the reserved_mem array is available, the information is copied over
from the temporary array into the reserved_mem array, and the memory for
the temporary array is freed back to the system.

The information for the statically-placed reserved memory regions does
not need to be stored in a temporary array because their starting
address is already stored in the devicetree.
Once the reserved_mem array is allocated, the information for the
statically-placed reserved memory regions is added to the array.
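
The two-phase flow described above can be sketched in userspace like
this. It is a simplified illustration under stated assumptions, not the
code in this series: calloc() stands in for memblock_alloc(), and
struct rmem_info, save_dynamic_region() and build_final_array() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TEMP_REGIONS 64	/* mirrors MAX_RESERVED_REGIONS in the text */

/* Hypothetical, simplified stand-in for the kernel's struct reserved_mem. */
struct rmem_info {
	const char *name;
	unsigned long long base;
	unsigned long long size;
};

/* Phase 1: static scratch array, usable before the page tables exist. */
static struct rmem_info temp_regions[MAX_TEMP_REGIONS];
static size_t temp_count;

/* Record a dynamically-placed region once its base has been chosen. */
static int save_dynamic_region(const char *name, unsigned long long base,
			       unsigned long long size)
{
	if (temp_count >= MAX_TEMP_REGIONS)
		return -1;	/* the 64-entry limit still applies here */
	temp_regions[temp_count++] = (struct rmem_info){
		.name = name, .base = base, .size = size,
	};
	return 0;
}

/*
 * Phase 2: allocate the final array with room for the saved dynamic
 * regions plus static_count "reg" regions, then copy the saved entries
 * to the front. The caller would fill in the static entries afterwards.
 */
static struct rmem_info *build_final_array(size_t static_count, size_t *total)
{
	struct rmem_info *final;

	assert(total != NULL);
	*total = temp_count + static_count;
	if (*total == 0)
		return NULL;
	final = calloc(*total, sizeof(*final));
	if (!final)
		return NULL;
	memcpy(final, temp_regions, temp_count * sizeof(*final));
	return final;
}
```

In this sketch the scratch array simply stays around after the copy; in
the series its memory is given back to the system once the final array
has been populated.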

Note:
Because a temporary array is used to store the information of the
dynamically-placed reserved memory regions, there is still a limit of
64 regions of this particular kind.
From my observation, these regions are typically few in number, so I
do not expect this to be an issue for now.

Patch Versions:

v10:
- Rebase patchset on v6.12-rc2.

v9:
- Fix issue reported from v8:
  https://lore.kernel.org/all/DU0PR04MB92999E9EEE959DBC3B1EAB6E80932@DU0PR04MB9299.eurprd04.prod.outlook.com/
  In v8, the rmem struct being passed into __reserved_mem_init_node()
  was not the same as what was being stored in the reserved_mem array.
  As a result, information such as rmem->ops was not being stored in
  the array for these regions.
  Make changes to pass the same reserved_mem struct into
  __reserved_mem_init_node() as what is being stored in the reserved_mem
  array.

v8:
https://lore.kernel.org/all/20240830162857.2821502-1-quic_obabatun@quicinc.com/
- Check the value of initial_boot_params in
  fdt_scan_reserved_mem_reg_nodes() to avoid breakage on architectures
  where this is not being used as was found to be the case for x86 in
  the issues reported below:
  https://lore.kernel.org/all/202408192157.8d8fe8a9-oliver.sang@intel.com/
  https://lore.kernel.org/all/ZsN_p9l8Pw2_X3j3@black.fi.intel.com/

v7:
https://lore.kernel.org/all/20240809184814.2703050-1-quic_obabatun@quicinc.com/
- Make changes to initialize the reserved memory regions earlier in
  response to issue reported in v6:
  https://lore.kernel.org/all/20240610213403.GA1697364@thelio-3990X/

- For the reserved regions to be set up properly,
  fdt_init_reserved_mem_node() needs to be called on each of the regions
  before the page tables are set up. Since the function requires a
  reference to the devicetree node of each region, we are not able to
  use the unflattened devicetree APIs since they are not available until
  after the page tables have been set up.
  Hence, revert the use of the unflattened devicetree APIs as a result
  of this limitation which was discovered in v6:
  https://lore.kernel.org/all/986361f4-f000-4129-8214-39f2fb4a90da@gmail.com/
  https://lore.kernel.org/all/DU0PR04MB9299C3EC247E1FE2C373440F80DE2@DU0PR04MB9299.eurprd04.prod.outlook.com/

v6:
https://lore.kernel.org/all/20240528223650.619532-1-quic_obabatun@quicinc.com/
- Rebased patchset on top of v6.10-rc1.
- Addressed comments received in v5 such as:
  1. Switched to using relevant typed functions such as
     of_property_read_u32(), of_property_present(), etc.
  2. Switched to using of_address_to_resource() to read the "reg"
     property of nodes.
  3. Renamed functions using "of_*" naming scheme instead of "dt_*".

v5:
https://lore.kernel.org/all/20240328211543.191876-1-quic_obabatun@quicinc.com/
- Rebased changes on top of v6.9-rc1.
- Addressed minor code comments from v4.

v4:
https://lore.kernel.org/all/20240308191204.819487-2-quic_obabatun@quicinc.com/
- Move fdt_init_reserved_mem() back into the unflatten_device_tree()
  function.
- Fix warnings found by Kernel test robot:
  https://lore.kernel.org/all/202401281219.iIhqs1Si-lkp@intel.com/
  https://lore.kernel.org/all/202401281304.tsu89Kcm-lkp@intel.com/
  https://lore.kernel.org/all/202401291128.e7tdNh5x-lkp@intel.com/

v3:
https://lore.kernel.org/all/20240126235425.12233-1-quic_obabatun@quicinc.com/
- Make use of __initdata to delete the temporary static array after
  dynamically allocating memory for reserved_mem array using memblock.
- Move call to fdt_init_reserved_mem() out of the
  unflatten_device_tree() function and into architecture specific setup
  code.
- Break up the changes for the individual architectures into separate
  patches.

v2:
https://lore.kernel.org/all/20231204041339.9902-1-quic_obabatun@quicinc.com/
- Extend changes to all other relevant architectures by moving
  fdt_init_reserved_mem() into the unflatten_device_tree() function.
- Add code to use the unflattened devicetree APIs to process the
  reserved memory regions.

v1:
https://lore.kernel.org/all/20231019184825.9712-1-quic_obabatun@quicinc.com/

References:
[1] https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/reserved-memory/reserved-memory.yaml#L79

Oreoluwa Babatunde (2):
  of: reserved_mem: Restructure how the reserved memory regions are
    processed
  of: reserved_mem: Add code to dynamically allocate reserved_mem array

 drivers/of/fdt.c             |   5 +-
 drivers/of/of_private.h      |   3 +-
 drivers/of/of_reserved_mem.c | 227 +++++++++++++++++++++++++++--------
 3 files changed, 179 insertions(+), 56 deletions(-)

-- 
2.34.1
Re: [PATCH v10 0/2] Dynamic Allocation of the reserved_mem array
Posted by Rob Herring 1 month, 1 week ago
On Tue, Oct 08, 2024 at 03:06:22PM -0700, Oreoluwa Babatunde wrote:
> [...]
> 
> Oreoluwa Babatunde (2):
>   of: reserved_mem: Restructure how the reserved memory regions are
>     processed
>   of: reserved_mem: Add code to dynamically allocate reserved_mem array
> 
>  drivers/of/fdt.c             |   5 +-
>  drivers/of/of_private.h      |   3 +-
>  drivers/of/of_reserved_mem.c | 227 +++++++++++++++++++++++++++--------
>  3 files changed, 179 insertions(+), 56 deletions(-)

Applied, thanks.

Rob