[PATCH 0/4] Support dynamic (de)configuration of memory

Sumanth Korikkar posted 4 patches 4 months, 1 week ago
There is a newer version of this series
[PATCH 0/4] Support dynamic (de)configuration of memory
Posted by Sumanth Korikkar 4 months, 1 week ago
Hi,

This patchset provides a new interface for dynamic configuration and
deconfiguration of hotplug memory on s390, with or without
memmap_on_memory support. It is a follow-up to the discussion with David
during the introduction of memmap_on_memory support for s390, and adds
support for dynamic (de)configuration of memory:
https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/

The original motivation for introducing memmap_on_memory on s390 was to
avoid using online memory to store struct page metadata, particularly
for standby memory blocks. This became critical in cases where there was
an imbalance between standby and online memory, potentially leading to
boot failures due to insufficient memory for metadata allocation.

To address this, memmap_on_memory was utilized on s390. However, in its
current form, it adds struct page metadata at the start of each memory
block at the time of addition (only standby memory), and this
configuration is static. It cannot be changed at runtime (for example,
when the user needs contiguous physical memory).

In order to provide more flexibility to the user and overcome the above
limitation, add an option to dynamically configure and deconfigure
hotpluggable memory blocks with/without memmap_on_memory.

With the new interface, s390 will no longer add all possible hotplug
memory in advance to make it visible in sysfs for online/offline
actions. Instead, before a memory block can be set online, it has to be
configured via a new interface in /sys/firmware/memory/memoryX/config,
which makes s390 similar to other architectures, i.e. adding hotpluggable
memory is controlled by the user instead of happening at boot time.

s390 kernel sysfs interface to configure/deconfigure memory with
memmap_on_memory (with upcoming lsmem changes):
    
* Initial memory layout:
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff   2G  online 0-15  yes        no
0x80000000-0xffffffff   2G offline 16-31 no         yes

* Configure memory
echo 1 > /sys/firmware/memory/memory16/config
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff    2G  online  0-15  yes        no
0x80000000-0x87ffffff  128M offline    16  yes        yes
0x88000000-0xffffffff  1.9G offline 17-31  no         yes

* Deconfigure memory
echo 0 > /sys/firmware/memory/memory16/config
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff   2G  online 0-15  yes        no
0x80000000-0xffffffff   2G offline 16-31 no         yes

* Enable memmap_on_memory and online it.
(Deconfigure first)
echo 0 > /sys/devices/system/memory/memory5/online
echo 0 > /sys/firmware/memory/memory5/config

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x27ffffff  640M  online 0-4   yes        no
0x28000000-0x2fffffff  128M offline 5     no         no
0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
0x80000000-0xffffffff    2G offline 16-31 no         yes

(Enable memmap_on_memory and online it)
echo 1 > /sys/firmware/memory/memory5/memmap_on_memory
echo 1 > /sys/firmware/memory/memory5/config
echo 1 > /sys/devices/system/memory/memory5/online

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x27ffffff  640M  online  0-4   yes        no
0x28000000-0x2fffffff  128M  online  5     yes        yes
0x30000000-0x7fffffff  1.3G  online  6-15  yes        no
0x80000000-0xffffffff    2G  offline 16-31 no         yes

* Disable memmap_on_memory and online it.
(Deconfigure first)
echo 0 > /sys/devices/system/memory/memory5/online
echo 0 > /sys/firmware/memory/memory5/config

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x27ffffff  640M  online 0-4   yes        no
0x28000000-0x2fffffff  128M offline 5     no         yes
0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
0x80000000-0xffffffff    2G offline 16-31 no         yes

(Disable memmap_on_memory and online it)
echo 0 > /sys/firmware/memory/memory5/memmap_on_memory
echo 1 > /sys/firmware/memory/memory5/config
echo 1 > /sys/devices/system/memory/memory5/online

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff  2G    online  0-15  yes        no
0x80000000-0xffffffff  2G    offline 16-31 no         yes

* Userspace changes:
The lsmem/chmem tools are also changed to use the new interface. I will
send the changes to util-linux soon.
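
For illustration only (not part of this series), the deconfigure and
reconfigure sequence from the sysfs examples above could be wrapped in a
small shell helper. The names SYSFS_ROOT and reconfigure_block are
hypothetical, and the sketch assumes that a deconfigured block has no
entry under /sys/devices/system/memory until it is configured again:

```shell
# Sketch only: SYSFS_ROOT and reconfigure_block are illustrative names.
# SYSFS_ROOT can point at a scratch directory for dry runs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# reconfigure_block <block-number> <0|1>
# Offline and deconfigure a block, select memmap_on_memory, then
# configure the block and set it online again.
reconfigure_block() {
    fw="$SYSFS_ROOT/firmware/memory/memory$1"
    dev="$SYSFS_ROOT/devices/system/memory/memory$1"

    # Assumption: a deconfigured block has no device entry, so only
    # offline it when the online attribute is present.
    if [ -e "$dev/online" ]; then
        echo 0 > "$dev/online" || return 1
    fi
    echo 0 > "$fw/config" || return 1
    echo "$2" > "$fw/memmap_on_memory" || return 1
    echo 1 > "$fw/config" || return 1
    echo 1 > "$dev/online" || return 1
}
```

Calling "reconfigure_block 5 1" would then correspond to the "Enable
memmap_on_memory and online it" steps, and "reconfigure_block 5 0" to the
"Disable memmap_on_memory and online it" steps.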

Patch 1 adds support for removal of boot-allocated memory blocks.

Patch 2 provides an option to dynamically configure and deconfigure memory
with/without memmap_on_memory.

Patch 3 removes MHP_OFFLINE_INACCESSIBLE from s390. The mhp flag was
used to mark memory as not accessible until the memory hotplug online
phase begins. However, with patch 2, it is no longer essential. Memory
can be brought to an accessible state before adding it, as the memory is
now added at runtime instead of at boot time.

Patch 4 removes the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers.
They are no longer needed. Memory can be brought to an accessible state
before adding it now, with runtime (de)configuration of memory.

Note: The patches apply to the linux-next branch.

Thank you

Sumanth Korikkar (4):
  s390/mm: Support removal of boot-allocated virtual memory map
  s390/sclp: Add support for dynamic (de)configuration of memory
  s390/sclp: Remove MHP_OFFLINE_INACCESSIBLE
  mm/memory_hotplug: Remove MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE
    notifiers

 arch/s390/mm/pgalloc.c         |   2 +
 arch/s390/mm/vmem.c            |  21 ++-
 drivers/base/memory.c          |  23 +--
 drivers/s390/char/sclp_mem.c   | 291 +++++++++++++++++++++++++++------
 include/linux/memory.h         |   9 -
 include/linux/memory_hotplug.h |  18 +-
 include/linux/memremap.h       |   1 -
 mm/memory_hotplug.c            |  17 +-
 mm/sparse.c                    |   3 +-
 9 files changed, 261 insertions(+), 124 deletions(-)

-- 
2.48.1
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by David Hildenbrand 4 months ago
On 26.09.25 15:15, Sumanth Korikkar wrote:
> Hi,

Hi,

> 
> Patchset provides a new interface for dynamic configuration and
> deconfiguration of hotplug memory on s390, allowing with/without
> memmap_on_memory support. It is a follow up on the discussion with David
> when introducing memmap_on_memory support for s390 and support dynamic
> (de)configuration of memory:
> https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
> https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/
> 
> The original motivation for introducing memmap_on_memory on s390 was to
> avoid using online memory to store struct pages metadata, particularly
> for standby memory blocks. This became critical in cases where there was
> an imbalance between standby and online memory, potentially leading to
> boot failures due to insufficient memory for metadata allocation.
> 
> To address this, memmap_on_memory was utilized on s390. However, in its
> current form, it adds struct pages metadata at the start of each memory
> block at the time of addition (only standby memory), and this
> configuration is static. It cannot be changed at runtime  (When the user
> needs continuous physical memory).
> 
> Inorder to provide more flexibility to the user and overcome the above
> limitation, add an option to dynamically configure and deconfigure
> hotpluggable memory block with/without memmap_on_memory.

This will cleanly add/remove the memory, including the directmap and 
other tracking data, so I like it.

> 
> With the new interface, s390 will not add all possible hotplug memory in
> advance, like before, to make it visible in sysfs for online/offline
> actions. Instead, before memory block can be set online, it has to be
> configured via a new interface in /sys/firmware/memory/memoryX/config,
> which makes s390 similar to others.  i.e. Adding of hotpluggable memory is
> controlled by the user instead of adding it at boottime.

Before I dig into the details, will onlining/offlining still trigger 
hypervisor action, or does that now really happen when memory is 
added/removed?

That would be really nice, because it would remove the whole need for 
"standby" memory, and having to treat hotplugged memory differently 
under LPAR/z/VM than anywhere else (-> keep it offline).

> 
> s390 kernel sysfs interface to configure/deconfigure memory with
> memmap_on_memory (with upcoming lsmem changes):
>      
> * Initial memory layout:
> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x7fffffff   2G  online 0-15  yes        no
> 0x80000000-0xffffffff   2G offline 16-31 no         yes

Could we instead modify "STATE" to reflect that it is "not added" / "not 
configured" / "disabled" etc?

Like

lsmem -o RANGE,SIZE,STATE,BLOCK,MEMMAP_ON_MEMORY
RANGE                 SIZE    STATE BLOCK
0x00000000-0x7fffffff   2G   online 0-15
0x80000000-0xffffffff   2G disabled 16-31

Or is that an attempt to maintain backwards compatibility?

> 
> * Configure memory
> echo 1 > /sys/firmware/memory/memory16/config

The granularity here is also memory_block_size_bytes(), correct?

> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x7fffffff    2G  online  0-15  yes        no
> 0x80000000-0x87ffffff  128M offline    16  yes        yes
> 0x88000000-0xffffffff  1.9G offline 17-31  no         yes
> 
> * Deconfigure memory
> echo 0 > /sys/firmware/memory/memory16/config
> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x7fffffff   2G  online 0-15  yes        no
> 0x80000000-0xffffffff   2G offline 16-31 no         yes
> 
> * Enable memmap_on_memory and online it.
> (Deconfigure first)
> echo 0 > /sys/devices/system/memory/memory5/online
> echo 0 > /sys/firmware/memory/memory5/config
> 
> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x27ffffff  640M  online 0-4   yes        no
> 0x28000000-0x2fffffff  128M offline 5     no         no
> 0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
> 0x80000000-0xffffffff    2G offline 16-31 no         yes
> 
> (Enable memmap_on_memory and online it)
> echo 1 > /sys/firmware/memory/memory5/memmap_on_memory
> echo 1 > /sys/firmware/memory/memory5/config
> echo 1 > /sys/devices/system/memory/memory5/online

I guess the use for memmap_on_memory would now be limited to making 
hotplug more likely to succeed in OOM scenarios.

> 
> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x27ffffff  640M  online  0-4   yes        no
> 0x28000000-0x2fffffff  128M  online  5     yes        yes
> 0x30000000-0x7fffffff  1.3G  online  6-15  yes        no
> 0x80000000-0xffffffff    2G  offline 16-31 no         yes
> 
> * Disable memmap_on_memory and online it.
> (Deconfigure first)
> echo 0 > /sys/devices/system/memory/memory5/online
> echo 0 > /sys/firmware/memory/memory5/config
> 
> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x27ffffff  640M  online 0-4   yes        no
> 0x28000000-0x2fffffff  128M offline 5     no         yes
> 0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
> 0x80000000-0xffffffff    2G offline 16-31 no         yes
> 
> (Disable memmap_on_memory and online it)
> echo 0 > /sys/firmware/memory/memory5/memmap_on_memory
> echo 1 > /sys/firmware/memory/memory5/config
> echo 1 > /sys/devices/system/memory/memory5/online
> 
> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
> 0x00000000-0x7fffffff  2G    online  0-15  yes        no
> 0x80000000-0xffffffff  2G    offline 16-31 no         yes
> 
> * Userspace changes:
> lsmem/chmem tool is also changed to use the new interface. I will send
> it to util-linux soon.
> 
> Patch 1 adds support for removal of boot-allocated memory blocks.
> 
> Patch 2 provides option to dynamically configure and deconfigure memory
> with/without memmap_on_memory.
> 
> Patch 3 removes MHP_OFFLINE_INACCESSIBLE from s390. The mhp flag was
> used to mark memory as not accessible until memory hotplug online phase
> begins.  However, with patch 2, it is no longer essential. Memory can be
> brought to accessible state before adding memory, as the memory is added
> during runttime now instead of boottime.

Nice.

> 
> Patch 4 removes the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers. It
> is no longer needed.  Memory can be brought to accessible state before
> adding memory now, with runtime (de)configuration of memory.

Nice.


-- 
Cheers

David / dhildenb
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by Sumanth Korikkar 4 months ago
> > With the new interface, s390 will not add all possible hotplug memory in
> > advance, like before, to make it visible in sysfs for online/offline
> > actions. Instead, before memory block can be set online, it has to be
> > configured via a new interface in /sys/firmware/memory/memoryX/config,
> > which makes s390 similar to others.  i.e. Adding of hotpluggable memory is
> > controlled by the user instead of adding it at boottime.
> 
> Before I dig into the details, will onlining/offling still trigger
> hypervisor action, or does that now really happen when memory is
> added/removed?
> 
> That would be really nice, because it would remove the whole need for
> "standby" memory, and having to treat hotplugged memory differently under
> LPAR/z/VM than anywhere else (-> keep it offline).

With this approach, hypervisor actions are triggered only when memory is
actually added or removed.

Online and offline operations are common code memory hotplug actions and
the s390 memory notifier actions are none/minimal.

> > s390 kernel sysfs interface to configure/deconfigure memory with
> > memmap_on_memory (with upcoming lsmem changes):
> > * Initial memory layout:
> > lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> > RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
> > 0x00000000-0x7fffffff   2G  online 0-15  yes        no
> > 0x80000000-0xffffffff   2G offline 16-31 no         yes
> 
> Could we instead modify "STATE" to reflect that it is "not added" / "not
> configured" / "disabled" etc?
> 
> Like
> 
> lsmem -o RANGE,SIZE,STATE,BLOCK,MEMMAP_ON_MEMORY
> RANGE                 SIZE    STATE BLOCK
> 0x00000000-0x7fffffff   2G   online 0-15
> 0x80000000-0xffffffff   2G disabled 16-31
> 
> Or is that an attempt to maintain backwards compatibility?

Mostly. It is also similar to the lscpu output, where the CPU status is
shown in the CONFIGURED/STATE columns.

Also, older scripts that list offline memory typically use:
lsmem | grep offline

and

chmem -e <SIZE> would work as usual, where <SIZE> specifies amount of
memory to set online.

chmem changes would look like:
chmem -c 128M -m 1 : configure memory with memmap-on-memory enabled
chmem -g 128M : deconfigure memory
chmem -e 128M : optionally configure (if supported by architecture) and
		always online memory
chmem -d 128M : offline and optionally deconfigure memory (if supported
		by architecture)
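
As a rough sketch of the intended -e / -d semantics at the sysfs level
(this is not the actual chmem implementation, and it works per memory
block rather than per size), the "optionally configure, always online"
behaviour could look like the following. SYSFS_ROOT, enable_block and
disable_block are hypothetical names:

```shell
# Sketch only: all names here are illustrative.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# enable_block <block-number>
# Configure the block first if the architecture exposes a config
# attribute, then always set it online.
enable_block() {
    fw="$SYSFS_ROOT/firmware/memory/memory$1"
    dev="$SYSFS_ROOT/devices/system/memory/memory$1"
    if [ -e "$fw/config" ]; then
        echo 1 > "$fw/config" || return 1
    fi
    echo 1 > "$dev/online"
}

# disable_block <block-number>
# Always set the block offline, then deconfigure it if the
# architecture exposes a config attribute.
disable_block() {
    fw="$SYSFS_ROOT/firmware/memory/memory$1"
    dev="$SYSFS_ROOT/devices/system/memory/memory$1"
    echo 0 > "$dev/online" || return 1
    if [ -e "$fw/config" ]; then
        echo 0 > "$fw/config"
    fi
}
```

On architectures without /sys/firmware/memory, both helpers reduce to the
plain online/offline writes that existing scripts already rely on.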

> > * Configure memory
> > echo 1 > /sys/firmware/memory/memory16/config
> 
> The granularity here is also memory_block_size_bytes(), correct?

Yes, correct.
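
To make the granularity concrete, the address range covered by a block
follows directly from the block number and the block size. A minimal
sketch (block_range is an illustrative helper, and the 128 MiB block size
is taken from the lsmem output in the cover letter):

```shell
# block_range <block-number> <block-size-in-MiB>
# Print the physical address range covered by one memory block.
block_range() {
    size=$(( $2 * 1024 * 1024 ))      # block size in bytes
    start=$(( $1 * size ))            # first byte of the block
    end=$(( start + size - 1 ))       # last byte of the block
    printf '0x%08x-0x%08x\n' "$start" "$end"
}
```

For example, "block_range 16 128" prints 0x80000000-0x87ffffff, which
matches the memory16 row in the "Configure memory" example. On a running
system the block size can be read (in hex) from
/sys/devices/system/memory/block_size_bytes.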

> > lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> > RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
> > 0x00000000-0x7fffffff    2G  online  0-15  yes        no
> > 0x80000000-0x87ffffff  128M offline    16  yes        yes
> > 0x88000000-0xffffffff  1.9G offline 17-31  no         yes
> > 
> > * Deconfigure memory
> > echo 0 > /sys/firmware/memory/memory16/config
> > lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> > RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
> > 0x00000000-0x7fffffff   2G  online 0-15  yes        no
> > 0x80000000-0xffffffff   2G offline 16-31 no         yes
> > 
> > * Enable memmap_on_memory and online it.
> > (Deconfigure first)
> > echo 0 > /sys/devices/system/memory/memory5/online
> > echo 0 > /sys/firmware/memory/memory5/config
> > 
> > lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
> > RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
> > 0x00000000-0x27ffffff  640M  online 0-4   yes        no
> > 0x28000000-0x2fffffff  128M offline 5     no         no
> > 0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
> > 0x80000000-0xffffffff    2G offline 16-31 no         yes
> > 
> > (Enable memmap_on_memory and online it)
> > echo 1 > /sys/firmware/memory/memory5/memmap_on_memory
> > echo 1 > /sys/firmware/memory/memory5/config
> > echo 1 > /sys/devices/system/memory/memory5/online
> 
> I guess the use for memmap_on_memory would now be limited to making hotplug
> more likely to succeed in OOM scenarios.

Yes, with memmap-on-memory enabled, mainly in OOM situations.

However, it also gives the user the flexibility to configure a few
memory blocks with memmap-on-memory enabled and a few with
memmap-on-memory disabled (when the user needs contiguous physical
memory across memory blocks).

> > Patch 4 removes the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers. It
> > is no longer needed.  Memory can be brought to accessible state before
> > adding memory now, with runtime (de)configuration of memory.
> 
> Nice.

Thank you David
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by David Hildenbrand 4 months ago
On 07.10.25 19:56, Sumanth Korikkar wrote:
>>> With the new interface, s390 will not add all possible hotplug memory in
>>> advance, like before, to make it visible in sysfs for online/offline
>>> actions. Instead, before memory block can be set online, it has to be
>>> configured via a new interface in /sys/firmware/memory/memoryX/config,
>>> which makes s390 similar to others.  i.e. Adding of hotpluggable memory is
>>> controlled by the user instead of adding it at boottime.
>>
>> Before I dig into the details, will onlining/offling still trigger
>> hypervisor action, or does that now really happen when memory is
>> added/removed?
>>
>> That would be really nice, because it would remove the whole need for
>> "standby" memory, and having to treat hotplugged memory differently under
>> LPAR/z/VM than anywhere else (-> keep it offline).
> 
> With this approach, hypervisor actions are triggered only when memory is
> actually added or removed.
> 
> Online and offline operations are common code memory hotplug actions and
> the s390 memory notifier actions are none/minimal.

Very nice.

> 
>>> s390 kernel sysfs interface to configure/deconfigure memory with
>>> memmap_on_memory (with upcoming lsmem changes):
>>> * Initial memory layout:
>>> lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
>>> RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
>>> 0x00000000-0x7fffffff   2G  online 0-15  yes        no
>>> 0x80000000-0xffffffff   2G offline 16-31 no         yes
>>
>> Could we instead modify "STATE" to reflect that it is "not added" / "not
>> configured" / "disabled" etc?
>>
>> Like
>>
>> lsmem -o RANGE,SIZE,STATE,BLOCK,MEMMAP_ON_MEMORY
>> RANGE                 SIZE    STATE BLOCK
>> 0x00000000-0x7fffffff   2G   online 0-15
>> 0x80000000-0xffffffff   2G disabled 16-31
>>
>> Or is that an attempt to maintain backwards compatibility?
> 
> Mostly. Also, similar to lscpu output, where CPU status shows
> CONFIGURED/STATE column.

Care to share an example output? I only have an s390x VM with 2 CPUs and 
no way to configure/deconfigure.

> 
> Also, older scripts to get list of offline memory typically use:
> lsmem | grep offline
> 
> and
> 
> chmem -e <SIZE> would work as usual, where <SIZE> specifies amount of
> memory to set online.
> 
> chmem changes would look like:
> chmem -c 128M -m 1 : configure memory with memmap-on-memory enabled
> chmem -g 128M : deconfigure memory

I wonder if the above two are really required. I would expect most/all 
users to simply keep using -e / -d.

Sure, there might be some corner cases, but I would assume most people 
to not want to care about memmap-on-memory with the new model.

> chmem -e 128M : optionally configure (if supported by architecture) and
> 		always online memory
> chmem -d 128M : offline and optionally deconfigure memory (if supported
> 		by architecture)

-- 
Cheers

David / dhildenb
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by Sumanth Korikkar 4 months ago
> Care to share an example output? I only have a s390x VM with 2 CPUs and no
> way to configure/deconfigure.

lscpu -e
CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2 ONLINE CONFIGURED POLARIZATION ADDRESS
  0    0      0    0      0    0 0:0:0         yes yes        vert-medium  0
  1    0      0    0      0    0 1:1:1         yes yes        vert-medium  1
  2    0      0    0      0    1 2:2:2         yes yes        vert-low     2
  3    0      0    0      0    1 3:3:3         yes yes        vert-low     3
  
# chcpu -d 2-3
CPU 2 disabled
CPU 3 disabled
# chcpu -g 2
CPU 2 deconfigured
# chcpu -c 2
CPU 2 configured
# chcpu -e 2-3
CPU 2 enabled
CPU 3 enabled

> > chmem changes would look like:
> > chmem -c 128M -m 1 : configure memory with memmap-on-memory enabled
> > chmem -g 128M : deconfigure memory
> 
> I wonder if the above two are really required. I would expect most/all users
> to simply keep using -e / -d.
> 
> Sure, there might be some corner cases, but I would assume most people to
> not want to care about memmap-on-memory with the new model.

I believe this remains very beneficial for customers in the following
scenario:

1) Initial memory layout:
4 GB configured online
512 GB standby

If memory_hotplug.memmap_on_memory=Y is set in the kernel command line:
Suppose the user requires more memory and onlines 256 GB. With memmap-on-memory
enabled, this likely succeeds by default.

Later, the user needs 256 GB of contiguous physical memory across memory
blocks. Then, the user can still configure those memory blocks with
memmap-on-memory disabled and online them.

2) If the administrator forgets to configure
memory_hotplug.memmap_on_memory=Y, the following steps can be taken:
Rescue from OOM situations: configure with memmap-on-memory enabled, online it.
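
A back-of-the-envelope calculation shows why the memmap placement matters
in scenario 1. The sketch below assumes 4 KiB pages and a 64-byte struct
page (typical for 64-bit kernels, but an assumption here); memmap_mib is
an illustrative helper:

```shell
# memmap_mib <hotplugged-size-in-GiB>
# Print the size of the struct page array (memmap) in MiB.
# Assumes 4 KiB pages and 64 bytes per struct page.
memmap_mib() {
    pages=$(( $1 * 1024 * 1024 / 4 ))        # GiB -> number of 4 KiB pages
    echo $(( pages * 64 / 1024 / 1024 ))     # bytes -> MiB
}
```

Under these assumptions "memmap_mib 256" prints 4096, i.e. onlining
256 GB without memmap-on-memory would need about 4 GB of memmap, which is
as much as the entire initially online memory in this layout.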

Thank you,
Sumanth
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by David Hildenbrand 4 months ago
On 08.10.25 08:05, Sumanth Korikkar wrote:
>> Care to share an example output? I only have a s390x VM with 2 CPUs and no
>> way to configure/deconfigure.
> 
> lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2 ONLINE CONFIGURED POLARIZATION ADDRESS
>    0    0      0    0      0    0 0:0:0         yes yes        vert-medium  0
>    1    0      0    0      0    0 1:1:1         yes yes        vert-medium  1
>    2    0      0    0      0    1 2:2:2         yes yes        vert-low     2
>    3    0      0    0      0    1 3:3:3         yes yes        vert-low     3
>    
> # chcpu -d 2-3
> CPU 2 disabled
> CPU 3 disabled
> # chcpu -g 2
> CPU 2 deconfigured
> # chcpu -c 2
> CPU 2 configured
> # chcpu -e 2-3
> CPU 2 enabled
> CPU 3 enabled

Makes sense, thanks!

> 
>>> chmem changes would look like:
>>> chmem -c 128M -m 1 : configure memory with memmap-on-memory enabled
>>> chmem -g 128M : deconfigure memory
>>
>> I wonder if the above two are really required. I would expect most/all users
>> to simply keep using -e / -d.
>>
>> Sure, there might be some corner cases, but I would assume most people to
>> not want to care about memmap-on-memory with the new model.
> 
> I believe this remains very beneficial for customers in the following
> scenario:
> 
> 1) Initial memory layout:
> 4 GB configured online
> 512 GB standby
> 
> If memory_hotplug.memmap_on_memory=Y is set in the kernel command line:
> Suppose user requires more memory and onlines 256 GB. With memmap-on-memory
> enabled, this likely succeeds by default.
> 
> Later, the user needs 256 GB of contiguous physical memory across memory
> blocks. Then, the user can still configure those memory blocks with
> memmap-on-memory disabled and online it.
> 
> 2) If the administrator forgets to configure
> memory_hotplug.memmap_on_memory=Y, the following steps can be taken:
> Rescue from OOM situations: configure with memmap-on-memory enabled, online it.

That's my point: I don't consider either very likely to be used by 
actual admins.

I guess in (1) it really only is a problem with very big memory blocks. 
Assuming a memory block is just 128 MiB (or even 1 GiB), you can 
add+online them individually. Once you succeeded with the first one 
(very likely), the other ones will follow.

Sure, if you are so low on memory that you cannot even add a single memory 
block, then memmap-on-memory makes sense.

But note that memmap-on-memory was added to handle hotplug of large 
chunks of memory (large DIMM/NVDIMM, large CXL device) in one go, 
without the chance to add+online individual memory blocks incrementally.
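
To put rough numbers on this, the per-block memmap cost can be estimated
as follows. This is only a sketch, assuming 4 KiB pages and a 64-byte
struct page; memmap_kib_per_block is an illustrative name:

```shell
# memmap_kib_per_block <block-size-in-MiB>
# Print the memmap needed to add one memory block, in KiB.
# Assumes 4 KiB pages and 64 bytes per struct page.
memmap_kib_per_block() {
    pages=$(( $1 * 1024 / 4 ))      # MiB -> number of 4 KiB pages
    echo $(( pages * 64 / 1024 ))   # bytes -> KiB
}
```

With these assumptions a 128 MiB block needs 2048 KiB (2 MiB) of memmap
and a 1 GiB block needs 16384 KiB (16 MiB), so each incremental
add+online step only requires a small amount of free memory.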

That's also the reason why I didn't care so far to implement 
memmap-on-memory support for virtio-mem: as we add+online individual 
(small) memory blocks, the implementation effort for supporting 
memmap_on_memory was so far not warranted.

(it's a bit trickier for virtio-mem to implement :) )

-- 
Cheers

David / dhildenb

Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by Sumanth Korikkar 4 months ago
> > > I wonder if the above two are really required. I would expect most/all users
> > > to simply keep using -e / -d.
> > > 
> > > Sure, there might be some corner cases, but I would assume most people to
> > > not want to care about memmap-on-memory with the new model.
> > 
> > I believe this remains very beneficial for customers in the following
> > scenario:
> > 
> > 1) Initial memory layout:
> > 4 GB configured online
> > 512 GB standby
> > 
> > If memory_hotplug.memmap_on_memory=Y is set in the kernel command line:
> > Suppose user requires more memory and onlines 256 GB. With memmap-on-memory
> > enabled, this likely succeeds by default.
> > 
> > Later, the user needs 256 GB of contiguous physical memory across memory
> > blocks. Then, the user can still configure those memory blocks with
> > memmap-on-memory disabled and online it.
> > 
> > 2) If the administrator forgets to configure
> > memory_hotplug.memmap_on_memory=Y, the following steps can be taken:
> > Rescue from OOM situations: configure with memmap-on-memory enabled, online it.
> 
> That's my point: I don't consider either very likely to be used by actual
> admins.
> 
> I guess in (1) it really only is a problem with very big memory blocks.
> Assuming a memory block is just 128 MiB (or even 1 GiB), you can add+online
> them individually. Once you succeeded with the first one (very likely), the
> other ones will follow.
> 
> Sure, if you are so low on memory that you cannot even a single memory
> block, then memmap-on-memory makes sense.
> 
> But note that memmap-on-memory was added to handle hotplug of large chunks
> of memory (large DIMM/NVDIMM, large CXL device) in one go, without the
> chance to add+online individual memory blocks incrementally.

Interesting. Thanks David.

Heiko suggested that the memory increment size could also be up to
64GB. In that case, it might be useful.

https://lore.kernel.org/all/20250521142149.11483C95-hca@linux.ibm.com/

> That's also the reason why I didn't care so far to implement
> memmap-on-memory support for virito-mem: as we add+online individual (small)
> emmory blocks, the implementation effort for supporting memmap_on_memory was
> so far not warranted.
> 
> (it's a bit trickier for virtio-mem to implement :) )
> 
> -- 
> Cheers
> 
> David / dhildenb
> 
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by David Hildenbrand 4 months ago
On 08.10.25 11:13, Sumanth Korikkar wrote:
>>>> I wonder if the above two are really required. I would expect most/all users
>>>> to simply keep using -e / -d.
>>>>
>>>> Sure, there might be some corner cases, but I would assume most people to
>>>> not want to care about memmap-on-memory with the new model.
>>>
>>> I believe this remains very beneficial for customers in the following
>>> scenario:
>>>
>>> 1) Initial memory layout:
>>> 4 GB configured online
>>> 512 GB standby
>>>
>>> If memory_hotplug.memmap_on_memory=Y is set in the kernel command line:
>>> Suppose user requires more memory and onlines 256 GB. With memmap-on-memory
>>> enabled, this likely succeeds by default.
>>>
>>> Later, the user needs 256 GB of contiguous physical memory across memory
>>> blocks. Then, the user can still configure those memory blocks with
>>> memmap-on-memory disabled and online it.
>>>
>>> 2) If the administrator forgets to configure
>>> memory_hotplug.memmap_on_memory=Y, the following steps can be taken:
>>> Rescue from OOM situations: configure with memmap-on-memory enabled, online it.
>>
>> That's my point: I don't consider either very likely to be used by actual
>> admins.
>>
>> I guess in (1) it really only is a problem with very big memory blocks.
>> Assuming a memory block is just 128 MiB (or even 1 GiB), you can add+online
>> them individually. Once you succeeded with the first one (very likely), the
>> other ones will follow.
>>
>> Sure, if you are so low on memory that you cannot even a single memory
>> block, then memmap-on-memory makes sense.
>>
>> But note that memmap-on-memory was added to handle hotplug of large chunks
>> of memory (large DIMM/NVDIMM, large CXL device) in one go, without the
>> chance to add+online individual memory blocks incrementally.
> 
> Interesting. Thanks David.
> 
> Heiko suggested that memory increment size could also be upto
> 64GB. In that case, it might be useful.

Yeah, rings a bell. But that would not be the 4GiB scenario you shared :)

-- 
Cheers

David / dhildenb

Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by Heiko Carstens 4 months ago
On Wed, Oct 08, 2025 at 10:02:26AM +0200, David Hildenbrand wrote:
> On 08.10.25 08:05, Sumanth Korikkar wrote:
> > > > chmem changes would look like:
> > > > chmem -c 128M -m 1 : configure memory with memmap-on-memory enabled
> > > > chmem -g 128M : deconfigure memory
> > > 
> > > I wonder if the above two are really required. I would expect most/all users
> > > to simply keep using -e / -d.
> > > 
> > > Sure, there might be some corner cases, but I would assume most people to
> > > not want to care about memmap-on-memory with the new model.

...

> > 2) If the administrator forgets to configure
> > memory_hotplug.memmap_on_memory=Y, the following steps can be taken:
> > Rescue from OOM situations: configure with memmap-on-memory enabled, online it.
> 
> That's my point: I don't consider either very likely to be used by actual
> admins.

But does it really hurt to add those options? If they are really needed, then
all of a sudden admins would have to deal with the architecture-specific sysfs
layout - so the very rare emergency case becomes even more complicated.

Given that these tools exist so that people don't have to deal with
such details, I'm very much in favor of adding those options.
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by David Hildenbrand 4 months ago
On 08.10.25 11:12, Heiko Carstens wrote:
> On Wed, Oct 08, 2025 at 10:02:26AM +0200, David Hildenbrand wrote:
>> On 08.10.25 08:05, Sumanth Korikkar wrote:
>>>>> chmem changes would look like:
>>>>> chmem -c 128M -m 1 : configure memory with memmap-on-memory enabled
>>>>> chmem -g 128M : deconfigure memory
>>>>
>>>> I wonder if the above two are really required. I would expect most/all users
>>>> to simply keep using -e / -d.
>>>>
>>>> Sure, there might be some corner cases, but I would assume most people to
>>>> not want to care about memmap-on-memory with the new model.
> 
> ...
> 
>>> 2) If the administrator forgets to configure
>>> memory_hotplug.memmap_on_memory=Y, the following steps can be taken:
>>> Rescue from OOM situations: configure with memmap-on-memory enabled, online it.
>>
>> That's my point: I don't consider either very likely to be used by actual
>> admins.
> 
> But does it really hurt to add those options?

Oh, I don't think so.

I was just a bit surprised to see it in the first version of this, 
because it felt to me like this is something to be added later on top 
quite easily/cleanly.

In particular, patch #2 would get a lot lighter also in terms of 
documentation.

So no strong opinion about adding it, but maybe we can just split it 
into a separate patch and focus in patch #2 on the real magic?

-- 
Cheers

David / dhildenb
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by Sumanth Korikkar 4 months ago
On Fri, Sep 26, 2025 at 03:15:23PM +0200, Sumanth Korikkar wrote:
> Hi,
> 
> Patchset provides a new interface for dynamic configuration and
> deconfiguration of hotplug memory on s390, allowing with/without
> memmap_on_memory support. It is a follow up on the discussion with David
> when introducing memmap_on_memory support for s390 and support dynamic
> (de)configuration of memory:
> https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
> https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/
> 
> The original motivation for introducing memmap_on_memory on s390 was to
> avoid using online memory to store struct pages metadata, particularly
> for standby memory blocks. This became critical in cases where there was
> an imbalance between standby and online memory, potentially leading to
> boot failures due to insufficient memory for metadata allocation.
> 
> To address this, memmap_on_memory was utilized on s390. However, in its
> current form, it adds struct pages metadata at the start of each memory
> block at the time of addition (only standby memory), and this
> configuration is static. It cannot be changed at runtime (when the user
> needs contiguous physical memory).
> 
> In order to provide more flexibility to the user and overcome the above
> limitation, add an option to dynamically configure and deconfigure
> hotpluggable memory block with/without memmap_on_memory.
> 
> With the new interface, s390 will not add all possible hotplug memory in
> advance, like before, to make it visible in sysfs for online/offline
> actions. Instead, before memory block can be set online, it has to be
> configured via a new interface in /sys/firmware/memory/memoryX/config,
> which makes s390 similar to others.  i.e. Adding of hotpluggable memory is
> controlled by the user instead of adding it at boottime.

Hi David,

Looking forward to your feedback to proceed further.

Thank you,
Sumanth
Re: [PATCH 0/4] Support dynamic (de)configuration of memory
Posted by David Hildenbrand 4 months ago
On 07.10.25 16:30, Sumanth Korikkar wrote:
> On Fri, Sep 26, 2025 at 03:15:23PM +0200, Sumanth Korikkar wrote:
>> Hi,
>>
>> Patchset provides a new interface for dynamic configuration and
>> deconfiguration of hotplug memory on s390, allowing with/without
>> memmap_on_memory support. It is a follow up on the discussion with David
>> when introducing memmap_on_memory support for s390 and support dynamic
>> (de)configuration of memory:
>> https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
>> https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/
>>
>> The original motivation for introducing memmap_on_memory on s390 was to
>> avoid using online memory to store struct pages metadata, particularly
>> for standby memory blocks. This became critical in cases where there was
>> an imbalance between standby and online memory, potentially leading to
>> boot failures due to insufficient memory for metadata allocation.
>>
>> To address this, memmap_on_memory was utilized on s390. However, in its
>> current form, it adds struct pages metadata at the start of each memory
>> block at the time of addition (only standby memory), and this
>> configuration is static. It cannot be changed at runtime (when the user
>> needs contiguous physical memory).
>>
>> In order to provide more flexibility to the user and overcome the above
>> limitation, add an option to dynamically configure and deconfigure
>> hotpluggable memory block with/without memmap_on_memory.
>>
>> With the new interface, s390 will not add all possible hotplug memory in
>> advance, like before, to make it visible in sysfs for online/offline
>> actions. Instead, before memory block can be set online, it has to be
>> configured via a new interface in /sys/firmware/memory/memoryX/config,
>> which makes s390 similar to others.  i.e. Adding of hotpluggable memory is
>> controlled by the user instead of adding it at boottime.
> 
> Hi David,
> 
> Looking forward to your feedback to proceed further.

Thanks for bumping it up in my inbox, will comment today :)

-- 
Cheers

David / dhildenb