[RFC PATCH 04/10] liveupdate: flb: allow getting FLB data in early boot

Pratyush Yadav posted 10 patches 2 months ago
[RFC PATCH 04/10] liveupdate: flb: allow getting FLB data in early boot
Posted by Pratyush Yadav 2 months ago
To support hugepage preservation using LUO, the hugetlb subsystem needs
to get liveupdate data when it allocates the hugepages to find out how
many pages are coming from live update. This data is preserved via LUO
FLB.

Since gigantic hugepage allocations happen before LUO (and much of the
rest of the system) is initialized, the usual
liveupdate_flb_get_incoming() cannot work.

Add a read-only variant that fetches the FLB data but does not trigger
its retrieve or do any locking or reference counting. It is the caller's
responsibility to make sure that using this data has no side effects on
the proper retrieve call that happens later.

Refactor the logic that finds the right FLB in the serialized data into a
helper that can be used from both luo_flb_retrieve_one() (called from
liveupdate_flb_get_incoming()) and liveupdate_flb_incoming_early().

Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
---
 include/linux/liveupdate.h  |  6 ++++
 kernel/liveupdate/luo_flb.c | 69 +++++++++++++++++++++++++++++--------
 2 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 78e8c529e4e7..39b429d2c62c 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -232,6 +232,7 @@ int liveupdate_unregister_flb(struct liveupdate_file_handler *fh,
 
 int liveupdate_flb_get_incoming(struct liveupdate_flb *flb, void **objp);
 int liveupdate_flb_get_outgoing(struct liveupdate_flb *flb, void **objp);
+int liveupdate_flb_incoming_early(struct liveupdate_flb *flb, u64 *datap);
 
 #else /* CONFIG_LIVEUPDATE */
 
@@ -283,5 +284,10 @@ static inline int liveupdate_flb_get_outgoing(struct liveupdate_flb *flb,
 	return -EOPNOTSUPP;
 }
 
+static inline int liveupdate_flb_incoming_early(struct liveupdate_flb *flb, u64 *datap)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif /* CONFIG_LIVEUPDATE */
 #endif /* _LINUX_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index e80ac5b575ec..fb287734a88e 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -145,12 +145,25 @@ static void luo_flb_file_unpreserve_one(struct liveupdate_flb *flb)
 	}
 }
 
+static struct luo_flb_ser *luo_flb_find_ser(struct luo_flb_header *fh,
+					    const char *name)
+{
+	if (!fh->active)
+		return ERR_PTR(-ENODATA);
+
+	for (int i = 0; i < fh->header_ser->count; i++) {
+		if (!strcmp(fh->ser[i].name, name))
+			return &fh->ser[i];
+	}
+
+	return ERR_PTR(-ENOENT);
+}
+
 static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
 {
 	struct luo_flb_private *private = luo_flb_get_private(flb);
-	struct luo_flb_header *fh = &luo_flb_global.incoming;
 	struct liveupdate_flb_op_args args = {0};
-	bool found = false;
+	struct luo_flb_ser *ser;
 	int err;
 
 	guard(mutex)(&private->incoming.lock);
@@ -158,20 +171,12 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
 	if (private->incoming.obj)
 		return 0;
 
-	if (!fh->active)
-		return -ENODATA;
+	ser = luo_flb_find_ser(&luo_flb_global.incoming, flb->compatible);
+	if (IS_ERR(ser))
+		return PTR_ERR(ser);
 
-	for (int i = 0; i < fh->header_ser->count; i++) {
-		if (!strcmp(fh->ser[i].name, flb->compatible)) {
-			private->incoming.data = fh->ser[i].data;
-			private->incoming.count = fh->ser[i].count;
-			found = true;
-			break;
-		}
-	}
-
-	if (!found)
-		return -ENOENT;
+	private->incoming.data = ser->data;
+	private->incoming.count = ser->count;
 
 	args.flb = flb;
 	args.data = private->incoming.data;
@@ -188,6 +193,40 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
 	return 0;
 }
 
+/**
+ * liveupdate_flb_incoming_early - Fetch FLB data in early boot.
+ * @flb:   The FLB definition
+ * @datap: Pointer to serialized state handle of the FLB
+ *
+ * This function is intended to be called during early boot, before the
+ * liveupdate subsystem is fully initialized. It must only be called after
+ * liveupdate_early_init().
+ *
+ * Directly returns the u64 handle to the serialized state of the FLB, and does
+ * not trigger its retrieve. A later fetch of the FLB will trigger the retrieve.
+ * Callers must make sure there are no side effects because of this.
+ *
+ * Return: 0 on success, -errno on failure. -ENODATA means no incoming FLB data,
+ * -ENOENT means specific FLB not found in incoming data, and -EOPNOTSUPP when
+ * live update is disabled or early initialization has not finished.
+ */
+int __init liveupdate_flb_incoming_early(struct liveupdate_flb *flb, u64 *datap)
+{
+	struct luo_flb_ser *ser;
+
+	if (!luo_early_initialized()) {
+		pr_warn("LUO FLB retrieved before LUO early init!\n");
+		return -EOPNOTSUPP;
+	}
+
+	ser = luo_flb_find_ser(&luo_flb_global.incoming, flb->compatible);
+	if (IS_ERR(ser))
+		return PTR_ERR(ser);
+
+	*datap = ser->data;
+	return 0;
+}
+
 static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 {
 	struct luo_flb_private *private = luo_flb_get_private(flb);
-- 
2.43.0
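
The lookup refactored by this patch can be tried outside the kernel. What follows is a minimal userspace sketch, not the kernel code: the struct layouts are simplified assumptions, the names flb_ser, flb_header, flb_find_ser() and flb_incoming_early() are hypothetical stand-ins, and ERR_PTR()/IS_ERR()/PTR_ERR() are reimplemented locally to mimic the kernel helpers. It shows how a single pointer return carries either a match or one of the two error codes described in the kernel-doc comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err)
{
	return (void *)err;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Simplified, hypothetical layouts; the real structures differ. */
struct flb_ser {
	char name[32];
	uint64_t data;
	uint64_t count;
};

struct flb_header {
	bool active;
	size_t count;
	struct flb_ser *ser;
};

/* Find the serialized entry for @name, or return an encoded errno. */
static struct flb_ser *flb_find_ser(struct flb_header *fh, const char *name)
{
	if (!fh->active)
		return ERR_PTR(-ENODATA);	/* no incoming data at all */

	for (size_t i = 0; i < fh->count; i++) {
		if (!strcmp(fh->ser[i].name, name))
			return &fh->ser[i];
	}

	return ERR_PTR(-ENOENT);		/* this FLB was not preserved */
}

/* Read-only lookup: copies out the handle, touches no other state. */
static int flb_incoming_early(struct flb_header *fh, const char *name,
			      uint64_t *datap)
{
	struct flb_ser *ser = flb_find_ser(fh, name);

	if (IS_ERR(ser))
		return (int)PTR_ERR(ser);

	*datap = ser->data;
	return 0;
}
```

A caller can tell "nothing was handed over" (-ENODATA) apart from "this FLB is absent" (-ENOENT) without a separate found flag, which is what lets the retrieve path and the early path share one helper.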
Re: [RFC PATCH 04/10] liveupdate: flb: allow getting FLB data in early boot
Posted by Pasha Tatashin 1 month, 2 weeks ago
On Sat, Dec 6, 2025 at 6:03 PM Pratyush Yadav <pratyush@kernel.org> wrote:
>
> To support hugepage preservation using LUO, the hugetlb subsystem needs
> to get liveupdate data when it allocates the hugepages to find out how
> many pages are coming from live update. This data is preserved via LUO
> FLB.
>
> Since gigantic hugepage allocations happen before LUO (and much of the
> rest of the system) is initialized, the usual
> liveupdate_flb_get_incoming() can not work.
>
> Add a read-only variant that fetches the FLB data but does not trigger
> its retrieve or do any locking or reference counting. It is the caller's
> responsibility to make sure there are no side effects of using this data
> to the proper retrieve call that would happen later.
>
> Refactor the logic to find the right FLB in the serialized data in a
> helper that can be used from both luo_flb_retrieve_one() (called from
> luo_flb_get_incoming()), and from luo_flb_get_incoming_early().
>
> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
> ---
>  include/linux/liveupdate.h  |  6 ++++
>  kernel/liveupdate/luo_flb.c | 69 +++++++++++++++++++++++++++++--------
>  2 files changed, 60 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> index 78e8c529e4e7..39b429d2c62c 100644
> --- a/include/linux/liveupdate.h
> +++ b/include/linux/liveupdate.h
> @@ -232,6 +232,7 @@ int liveupdate_unregister_flb(struct liveupdate_file_handler *fh,
>
>  int liveupdate_flb_get_incoming(struct liveupdate_flb *flb, void **objp);
>  int liveupdate_flb_get_outgoing(struct liveupdate_flb *flb, void **objp);
> +int liveupdate_flb_incoming_early(struct liveupdate_flb *flb, u64 *datap);

Hi Pratyush,

[Follow-up from LPC discussion]

This patch is not needed; you can use liveupdate_flb_get_incoming()
directly in early boot. The main concern is that we take a mutex in that
function, but I think that is safe. might_sleep() is handled properly
when called early in boot: it has a "system_state == SYSTEM_BOOTING"
check that silences the warning during boot.

Pasha
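
The boot-time argument above can be illustrated with a small userspace model. This is only a sketch of the idea under stated assumptions: the enum and the function model_might_sleep_warns() are hypothetical, and the real kernel check considers more conditions than the two inputs used here. The point it shows is that a sleeping-context warning is suppressed while the system state is still "booting".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, abridged model of the system state progression. */
enum model_system_state {
	MODEL_SYSTEM_BOOTING,
	MODEL_SYSTEM_SCHEDULING,
	MODEL_SYSTEM_RUNNING,
};

/*
 * Simplified model: report whether a "sleeping function called from
 * invalid context" style warning would be emitted. During boot the
 * check returns early, so no warning is produced.
 */
static bool model_might_sleep_warns(enum model_system_state state,
				    bool atomic_context)
{
	if (state == MODEL_SYSTEM_BOOTING)
		return false;

	return atomic_context;
}
```

In this model a call made in early boot never warns, while the same call from an atomic context after boot does.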
Re: [RFC PATCH 04/10] liveupdate: flb: allow getting FLB data in early boot
Posted by Pratyush Yadav 1 month, 2 weeks ago
On Thu, Dec 18 2025, Pasha Tatashin wrote:

> On Sat, Dec 6, 2025 at 6:03 PM Pratyush Yadav <pratyush@kernel.org> wrote:
>>
>> To support hugepage preservation using LUO, the hugetlb subsystem needs
>> to get liveupdate data when it allocates the hugepages to find out how
>> many pages are coming from live update. This data is preserved via LUO
>> FLB.
>>
>> Since gigantic hugepage allocations happen before LUO (and much of the
>> rest of the system) is initialized, the usual
>> liveupdate_flb_get_incoming() can not work.
>>
>> Add a read-only variant that fetches the FLB data but does not trigger
>> its retrieve or do any locking or reference counting. It is the caller's
>> responsibility to make sure there are no side effects of using this data
>> to the proper retrieve call that would happen later.
>>
>> Refactor the logic to find the right FLB in the serialized data in a
>> helper that can be used from both luo_flb_retrieve_one() (called from
>> luo_flb_get_incoming()), and from luo_flb_get_incoming_early().
>>
>> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
>> ---
>>  include/linux/liveupdate.h  |  6 ++++
>>  kernel/liveupdate/luo_flb.c | 69 +++++++++++++++++++++++++++++--------
>>  2 files changed, 60 insertions(+), 15 deletions(-)
>>
>> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
>> index 78e8c529e4e7..39b429d2c62c 100644
>> --- a/include/linux/liveupdate.h
>> +++ b/include/linux/liveupdate.h
>> @@ -232,6 +232,7 @@ int liveupdate_unregister_flb(struct liveupdate_file_handler *fh,
>>
>>  int liveupdate_flb_get_incoming(struct liveupdate_flb *flb, void **objp);
>>  int liveupdate_flb_get_outgoing(struct liveupdate_flb *flb, void **objp);
>> +int liveupdate_flb_incoming_early(struct liveupdate_flb *flb, u64 *datap);
>
> Hi Pratyush,
>
> [Follow-up from LPC discussion]
>
> This patch is not needed, you can use liveupdate_flb_get_incoming()
> directly in early boot. The main concern is that we take mutex in that
> function, but that I think is safe. The might_sleep() has the proper
> handling to be called early in boot, it has "system_state ==
> SYSTEM_BOOTING" check to silence warning during boot.

Right. I will give it a try. For hugetlb, this works fine since it
doesn't really need to do much in FLB retrieve anyway, it just needs to
parse some data structures.

If other subsystems end up needing a two-part retrieve, one in early
boot and one later, then I think it would be a good idea to model that
properly instead of leaving it up to the subsystem to manage it.

Anyway, that isn't a real problem today so let's look at it when it does
show up.

-- 
Regards,
Pratyush Yadav
Re: [RFC PATCH 04/10] liveupdate: flb: allow getting FLB data in early boot
Posted by Pasha Tatashin 1 month, 2 weeks ago
> > [Follow-up from LPC discussion]
> >
> > This patch is not needed, you can use liveupdate_flb_get_incoming()
> > directly in early boot. The main concern is that we take mutex in that
> > function, but that I think is safe. The might_sleep() has the proper
> > handling to be called early in boot, it has "system_state ==
> > SYSTEM_BOOTING" check to silence warning during boot.
>
> Right. I will give it a try. For hugetlb, this works fine since it
> doesn't really need to do much in FLB retrieve anyway, it just needs to
> parse some data structures.
>
> If other subsystems end up needing a two-part retrieve, one in early
> boot and one later, then I think it would be a good idea to model that
> properly instead of leaving it up to the subsystem to manage it.
>
> Anyway, that isn't a real problem today so let's look at it when it does
> show up.

FLB has exactly one .retrieve() lifecycle event. Once called, the data
is considered fully available and cached in private->incoming.obj.

If a subsystem has a requirement where it needs a specific state
available very early and other state available much later, the clean
solution is simply to register two separate FLBs.

Pasha
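
The single-retrieve lifecycle and the "two separate FLBs" suggestion can be modelled in userspace. This is a minimal sketch, assuming a much simpler structure than the real one: struct model_flb, model_retrieve() and model_flb_get_incoming() are hypothetical names, and the locking and reference counting of the real code are left out. It shows that the retrieve callback runs once per FLB and that later calls return the cached object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one FLB; the real structure is richer. */
struct model_flb {
	const char *compatible;
	uint64_t data;		/* serialized handle from the old kernel */
	void *obj;		/* cached result of the retrieve callback */
	int retrieve_calls;	/* bookkeeping for this example only */
	int (*retrieve)(struct model_flb *flb, void **objp);
};

/* Example callback: "deserializes" by pointing at the stored handle. */
static int model_retrieve(struct model_flb *flb, void **objp)
{
	*objp = &flb->data;
	return 0;
}

/* First call runs retrieve; every later call returns the cached object. */
static int model_flb_get_incoming(struct model_flb *flb, void **objp)
{
	if (!flb->obj) {
		void *obj = NULL;
		int err;

		flb->retrieve_calls++;
		err = flb->retrieve(flb, &obj);
		if (err)
			return err;

		flb->obj = obj;
	}

	*objp = flb->obj;
	return 0;
}
```

A subsystem that needs some state early and other state later would, under this model, define two such objects (say, one with a hypothetical "subsys-early-v1" compatible string and one with "subsys-late-v1") and fetch each when it is needed; neither fetch affects the other's cached object.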
Re: [RFC PATCH 04/10] liveupdate: flb: allow getting FLB data in early boot
Posted by Pratyush Yadav 1 month, 2 weeks ago
On Sat, Dec 20 2025, Pasha Tatashin wrote:

>> > [Follow-up from LPC discussion]
>> >
>> > This patch is not needed, you can use liveupdate_flb_get_incoming()
>> > directly in early boot. The main concern is that we take mutex in that
>> > function, but that I think is safe. The might_sleep() has the proper
>> > handling to be called early in boot, it has "system_state ==
>> > SYSTEM_BOOTING" check to silence warning during boot.
>>
>> Right. I will give it a try. For hugetlb, this works fine since it
>> doesn't really need to do much in FLB retrieve anyway, it just needs to
>> parse some data structures.
>>
>> If other subsystems end up needing a two-part retrieve, one in early
>> boot and one later, then I think it would be a good idea to model that
>> properly instead of leaving it up to the subsystem to manage it.
>>
>> Anyway, that isn't a real problem today so let's look at it when it does
>> show up.
>
> FLB has exactly one .retrieve() lifecycle event. Once called, the data
> is considered fully available and cached in private->incoming.obj.
>
> If a subsystem has a requirement where it needs a specific state
> available very early and other state available much later, the clean
> solution is simply to register two separate FLBs.

Hmm, that can work too. Anyway, let's figure that out when there is a
real use case. For now, the current FLB design works fine.

-- 
Regards,
Pratyush Yadav