[PATCH 1/5] brd: fix oops if write concurrent with discard

Yu Kuai posted 5 patches 3 weeks, 4 days ago
There is a newer version of this series
Posted by Yu Kuai 3 weeks, 4 days ago
From: Yu Kuai <yukuai3@huawei.com>

A user can issue a write and a discard concurrently, triggering the
following BUG_ON:
cpu0:
brd_submit_bio
 brd_do_bvec
  copy_to_brd_setup
   brd_insert_page
    xa_lock
    __xa_insert
    xa_unlock
				cpu1:
				brd_submit_bio
				 brd_do_discard
				  xa_lock
				  page = __xa_erase
				  __free_page
				  xa_unlock
  copy_to_brd
   brd_lookup_page
    page = xa_load
    BUG_ON(!page)

Fix this problem by skipping the write if the page is no longer there;
a later read of that range will then return zeroed data.

Also fix the following checkpatch warnings:
WARNING: Deprecated use of 'kmap_atomic', prefer 'kmap_local_page' instead
WARNING: Deprecated use of 'kunmap_atomic', prefer 'kunmap_local' instead

Fixes: 9ead7efc6f3f ("brd: implement discard support")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/brd.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 292f127cae0a..a6e4f005cb76 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -133,22 +133,22 @@ static void copy_to_brd(struct brd_device *brd, const void *src,
 
 	copy = min_t(size_t, n, PAGE_SIZE - offset);
 	page = brd_lookup_page(brd, sector);
-	BUG_ON(!page);
-
-	dst = kmap_atomic(page);
-	memcpy(dst + offset, src, copy);
-	kunmap_atomic(dst);
+	if (page) {
+		dst = kmap_local_page(page);
+		memcpy(dst + offset, src, copy);
+		kunmap_local(dst);
+	}
 
 	if (copy < n) {
 		src += copy;
 		sector += copy >> SECTOR_SHIFT;
 		copy = n - copy;
 		page = brd_lookup_page(brd, sector);
-		BUG_ON(!page);
-
-		dst = kmap_atomic(page);
-		memcpy(dst, src, copy);
-		kunmap_atomic(dst);
+		if (page) {
+			dst = kmap_local_page(page);
+			memcpy(dst, src, copy);
+			kunmap_local(dst);
+		}
 	}
 }
 
@@ -166,9 +166,9 @@ static void copy_from_brd(void *dst, struct brd_device *brd,
 	copy = min_t(size_t, n, PAGE_SIZE - offset);
 	page = brd_lookup_page(brd, sector);
 	if (page) {
-		src = kmap_atomic(page);
+		src = kmap_local_page(page);
 		memcpy(dst, src + offset, copy);
-		kunmap_atomic(src);
+		kunmap_local(src);
 	} else
 		memset(dst, 0, copy);
 
@@ -178,9 +178,9 @@ static void copy_from_brd(void *dst, struct brd_device *brd,
 		copy = n - copy;
 		page = brd_lookup_page(brd, sector);
 		if (page) {
-			src = kmap_atomic(page);
+			src = kmap_local_page(page);
 			memcpy(dst, src, copy);
-			kunmap_atomic(src);
+			kunmap_local(src);
 		} else
 			memset(dst, 0, copy);
 	}
@@ -208,7 +208,7 @@ static int brd_do_bvec(struct brd_device *brd, struct page *page,
 			goto out;
 	}
 
-	mem = kmap_atomic(page);
+	mem = kmap_local_page(page);
 	if (!op_is_write(opf)) {
 		copy_from_brd(mem + off, brd, sector, len);
 		flush_dcache_page(page);
@@ -216,7 +216,7 @@ static int brd_do_bvec(struct brd_device *brd, struct page *page,
 		flush_dcache_page(page);
 		copy_to_brd(brd, mem + off, sector, len);
 	}
-	kunmap_atomic(mem);
+	kunmap_local(mem);
 
 out:
 	return err;
-- 
2.39.2

Re: [PATCH 1/5] brd: fix oops if write concurrent with discard
Posted by Christoph Hellwig 3 weeks, 2 days ago
On Fri, Apr 18, 2025 at 05:38:22PM +0800, Yu Kuai wrote:
>  	copy = min_t(size_t, n, PAGE_SIZE - offset);
>  	page = brd_lookup_page(brd, sector);
> -	BUG_ON(!page);
> -
> -	dst = kmap_atomic(page);
> -	memcpy(dst + offset, src, copy);
> -	kunmap_atomic(dst);
> +	if (page) {
> +		dst = kmap_local_page(page);
> +		memcpy(dst + offset, src, copy);
> +		kunmap_local(dst);
> +	}

I don't see how this can fix any race; it just narrows the race
window.  To fix the race for real, copy_to_brd_setup needs to return
a page and keep a reference to it for the caller.  The caller then
only needs to operate on a single page.

We'll also need to do something similar for brd_lookup_page to
ensure the page reference doesn't go away after the xarray lookup
but before using the page.
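
The reference-holding idea above can be modelled in userspace. The
sketch below is only an analogy, not brd code: model_page, model_store
and every model_* function are hypothetical names, a mutex-protected
array stands in for the xarray, and a plain atomic counter stands in
for the page refcount. It assumes the store owns one reference and each
user takes its own, so a discard that lands between setup and copy
unlinks the page without freeing it under the writer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_NR_SLOTS  16

/* Hypothetical stand-in for struct page: a refcount plus the data. */
struct model_page {
	atomic_int refs;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Hypothetical stand-in for the brd xarray: fixed slots behind one lock. */
struct model_store {
	pthread_mutex_t lock;
	struct model_page *slots[MODEL_NR_SLOTS];
	atomic_int live_pages;	/* pages allocated and not yet freed */
};

static void model_store_init(struct model_store *s)
{
	pthread_mutex_init(&s->lock, NULL);
	memset(s->slots, 0, sizeof(s->slots));
	atomic_init(&s->live_pages, 0);
}

/* Drop one reference; the memory is released only with the last one. */
static void model_put(struct model_store *s, struct model_page *p)
{
	if (atomic_fetch_sub(&p->refs, 1) == 1) {
		atomic_fetch_sub(&s->live_pages, 1);
		free(p);
	}
}

/* Look up or insert, and return the page with a caller reference held. */
static struct model_page *model_get_or_insert(struct model_store *s,
					      unsigned idx)
{
	struct model_page *p;

	pthread_mutex_lock(&s->lock);
	p = s->slots[idx];
	if (!p) {
		p = calloc(1, sizeof(*p));
		if (p) {
			atomic_init(&p->refs, 1);	/* owned by the store */
			atomic_fetch_add(&s->live_pages, 1);
			s->slots[idx] = p;
		}
	}
	if (p)
		atomic_fetch_add(&p->refs, 1);		/* owned by the caller */
	pthread_mutex_unlock(&s->lock);
	return p;
}

/* Discard: unlink the page and drop only the store's reference. */
static void model_discard(struct model_store *s, unsigned idx)
{
	struct model_page *p;

	pthread_mutex_lock(&s->lock);
	p = s->slots[idx];
	s->slots[idx] = NULL;
	pthread_mutex_unlock(&s->lock);
	if (p)
		model_put(s, p);
}

/*
 * Write path with the discard placed between setup and copy, as in the
 * trace from the patch description. Returns the number of live pages
 * seen while the writer still holds its reference, or -1 on allocation
 * failure.
 */
static int model_write_racing_discard(struct model_store *s, unsigned idx)
{
	struct model_page *p = model_get_or_insert(s, idx);
	int still_live;

	if (!p)
		return -1;
	model_discard(s, idx);			/* simulated racing discard */
	memset(p->data, 0xab, MODEL_PAGE_SIZE);	/* page is still valid here */
	still_live = atomic_load(&s->live_pages);
	model_put(s, p);			/* last reference: page freed */
	return still_live;
}
```

In this model the write after the discard lands in a page that is
already unlinked, so the data is dropped once the writer puts its
reference, but the writer never touches freed memory. How the real
driver should take and drop the reference is left to the actual fix.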

> Also fix the following checkpatch warnings:
> WARNING: Deprecated use of 'kmap_atomic', prefer 'kmap_local_page' instead
> WARNING: Deprecated use of 'kunmap_atomic', prefer 'kunmap_local' instead

This really should be using bvec_kmap_local.  I actually have an
entire series to fix that and clean up some of the surroundings that I
need to send out.  Let me dust that off because it might help with the
above-mentioned fixes as well.