[PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish

Leo Timmins posted 2 patches 1 week ago
[PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
Posted by Leo Timmins 1 week ago
luo_flb_file_finish_one() decremented incoming.count before making sure
that the incoming FLB state had been materialized. If no earlier incoming
retrieval had populated that state, the first decrement ran from zero and
skipped the last-user finish path.

Initialize the incoming FLB state before the first decrement so finish
uses the serialized refcount instead of an uninitialized value.

Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
---
 kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..855af655b09b 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
 static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 {
 	struct luo_flb_private *private = luo_flb_get_private(flb);
+	bool needs_retrieve = false;
 	u64 count;
 
-	scoped_guard(mutex, &private->incoming.lock)
+	scoped_guard(mutex, &private->incoming.lock) {
+		if (!private->incoming.count && !private->incoming.finished)
+			needs_retrieve = true;
+	}
+
+	if (needs_retrieve) {
+		int err = luo_flb_retrieve_one(flb);
+
+		if (err) {
+			pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
+				flb->compatible, ERR_PTR(err));
+			return;
+		}
+	}
+
+	scoped_guard(mutex, &private->incoming.lock) {
 		count = --private->incoming.count;
+	}
 
 	if (!count) {
 		struct liveupdate_flb_op_args args = {0};
-- 
2.53.0
Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
Posted by Pratyush Yadav 10 hours ago
On Thu, Mar 26 2026, Leo Timmins wrote:

> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.

This commit message makes it very hard to understand what the problem it
fixes. It took me 20 minutes of reading this patch and looking at the
FLB code to figure out what is going on. Here is what I'd suggest:

    The state of an incoming FLB object is initialized when it is first
    used. The initialization is done via luo_flb_retrieve_one(), which looks
    at all the incoming FLBs, matches the FLB to its serialized entry, and
    initializes the incoming data and count.

    luo_flb_file_finish_one() is called when finish is called for a file
    registered with this FLB. If no file handler has used the FLB by this
    point, the count stays un-initialized at 0. luo_flb_file_finish_one()
    then decrements this un-initialized count, leading to an underflow. This
    results in the FLB finish never being called since the count has
    underflowed to a very large value.

    Fix this by making sure the FLB is retrieved before using its count.

>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
> ---
>  kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
>  static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
>  {
>  	struct luo_flb_private *private = luo_flb_get_private(flb);
> +	bool needs_retrieve = false;
>  	u64 count;
>  
> -	scoped_guard(mutex, &private->incoming.lock)
> +	scoped_guard(mutex, &private->incoming.lock) {
> +		if (!private->incoming.count && !private->incoming.finished)
> +			needs_retrieve = true;
> +	}
> +
> +	if (needs_retrieve) {
> +		int err = luo_flb_retrieve_one(flb);
> +
> +		if (err) {
> +			pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> +				flb->compatible, ERR_PTR(err));
> +			return;
> +		}
> +	}
> +
> +	scoped_guard(mutex, &private->incoming.lock) {
>  		count = --private->incoming.count;
> +	}

This looks way too overcomplicated. We just need to move the retrieve
call before we check the count.

Pasha, side note: I think FLB suffers from the same problem I fixed for
LUO files with f85b1c6af5bc ("liveupdate: luo_file: remember retrieve()
status"). If retrieve fails, it doesn't remember the error code and will
retry retrieve() on next usage, causing all sorts of problems.

Pasha, another side note: I think incoming.private should be initialized
when the FLB is registered, not when it is first used. This will make
this simpler overall. This would need liveupdate_register_flb() to not
call kzalloc(), but that can be done I think.

Anyway, for this problem, how about the patch below (only
compile-tested) addressing my comments. The diff looks bigger but the
end result is a lot cleaner IMO. In effect it just moves the retrieve
call outside the if (!count). Everything else is just cleaning up the
locking situation.

--- 8< ---
From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
Date: Thu, 2 Apr 2026 13:22:05 +0000
Subject: [PATCH] liveupdate: initialize incoming FLB state before finish

The state of an incoming FLB object is initialized when it is first
used. The initialization is done via luo_flb_retrieve_one(), which looks
at all the incoming FLBs, matches the FLB to its serialized entry, and
initializes the incoming data and count.

luo_flb_file_finish_one() is called when finish is called for a file
registered with this FLB. If no file handler has used the FLB by this
point, the count stays un-initialized at 0. luo_flb_file_finish_one()
then decrements this un-initialized count, leading to an underflow. This
results in the FLB finish never being called since the count has
underflowed to a very large value.

Fix this by making sure the FLB is retrieved before using its count.

Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Suggested-by: Leo Timmins <leotimmins1974@gmail.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
 kernel/liveupdate/luo_flb.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..be141620751e 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 	struct luo_flb_private *private = luo_flb_get_private(flb);
 	u64 count;
 
-	scoped_guard(mutex, &private->incoming.lock)
-		count = --private->incoming.count;
+	if (!private->incoming.retrieved) {
+		int err = luo_flb_retrieve_one(flb);
 
+		if (WARN_ON(err))
+			return;
+	}
+
+	guard(mutex)(&private->incoming.lock);
+
+	count = --private->incoming.count;
 	if (!count) {
 		struct liveupdate_flb_op_args args = {0};
 
-		if (!private->incoming.retrieved) {
-			int err = luo_flb_retrieve_one(flb);
+		args.flb = flb;
+		args.obj = private->incoming.obj;
+		flb->ops->finish(&args);
 
-			if (WARN_ON(err))
-				return;
-		}
-
-		scoped_guard(mutex, &private->incoming.lock) {
-			args.flb = flb;
-			args.obj = private->incoming.obj;
-			flb->ops->finish(&args);
-
-			private->incoming.data = 0;
-			private->incoming.obj = NULL;
-			private->incoming.finished = true;
-		}
+		private->incoming.data = 0;
+		private->incoming.obj = NULL;
+		private->incoming.finished = true;
 	}
 }
 
-- 
Regards,
Pratyush Yadav
Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
Posted by Andrew Morton 5 hours ago
On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:

> The state of an incoming FLB object is initialized when it is first
> used. The initialization is done via luo_flb_retrieve_one(), which looks
> at all the incoming FLBs, matches the FLB to its serialized entry, and
> initializes the incoming data and count.
> 
> luo_flb_file_finish_one() is called when finish is called for a file
> registered with this FLB. If no file handler has used the FLB by this
> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
> then decrements this un-initialized count, leading to an underflow. This
> results in the FLB finish never being called since the count has
> underflowed to a very large value.
> 
> Fix this by making sure the FLB is retrieved before using its count.

I like that the above tells people what the actual bug is!

I still have both Leo's patches in mm.git, in wait-and-see mode.  What
to do here?  Should I upstream [1/2] and drop [2/2]?  Drop both and
revisit after -rc1?

Also, did we consider cc:stable for these two?  Perhaps add the
cc:stable if we decide to attend to this after -rc1?
Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
Posted by Pasha Tatashin 1 week ago
On Thu, Mar 26, 2026 at 12:26 AM Leo Timmins <leotimmins1974@gmail.com> wrote:
>
> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.
>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
> ---
>  kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
>  static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
>  {
>         struct luo_flb_private *private = luo_flb_get_private(flb);
> +       bool needs_retrieve = false;
>         u64 count;
>
> -       scoped_guard(mutex, &private->incoming.lock)
> +       scoped_guard(mutex, &private->incoming.lock) {
> +               if (!private->incoming.count && !private->incoming.finished)
> +                       needs_retrieve = true;
> +       }
> +
> +       if (needs_retrieve) {
> +               int err = luo_flb_retrieve_one(flb);
> +
> +               if (err) {
> +                       pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> +                               flb->compatible, ERR_PTR(err));
> +                       return;
> +               }
> +       }
> +
> +       scoped_guard(mutex, &private->incoming.lock) {
>                 count = --private->incoming.count;
> +       }

The braces are still added

>
>         if (!count) {
>                 struct liveupdate_flb_op_args args = {0};
> --
> 2.53.0
>