From nobody Sat Dec 27 17:02:36 2025
Date: Mon, 18 Dec 2023 13:40:27 +0100
In-Reply-To: <20231218124033.551770-1-glider@google.com>
References: <20231218124033.551770-1-glider@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
Message-ID: <20231218124033.551770-2-glider@google.com>
Subject: [PATCH v11-mte 1/7] lib/bitmap: add bitmap_{read,write}()
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org,
 Arnd Bergmann

From: Syed Nayyar Waris

The two new functions allow reading/writing values of length up to
BITS_PER_LONG bits at an arbitrary position in the bitmap.
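
For illustration only, the semantics described above can be modelled in
userspace C roughly as follows. This is a minimal sketch, not the kernel
code: model_read(), model_write() and low_mask() are hypothetical names
introduced here, and the sketch assumes that a zero or too-large nbits
reads as 0 and writes nothing.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(CHAR_BIT * sizeof(unsigned long)))

/* Hypothetical helper: low nbits set, valid for 1 <= nbits <= BITS_PER_LONG. */
static unsigned long low_mask(unsigned long nbits)
{
	return ~0UL >> (BITS_PER_LONG - nbits);
}

/* Hypothetical model of a read of nbits bits starting at bit 'start'. */
static unsigned long model_read(const unsigned long *map, unsigned long start,
				unsigned long nbits)
{
	unsigned long index = start / BITS_PER_LONG;
	unsigned long offset = start % BITS_PER_LONG;
	unsigned long space = BITS_PER_LONG - offset;
	unsigned long low, high;

	if (!nbits || nbits > BITS_PER_LONG)
		return 0;
	if (space >= nbits)
		return (map[index] >> offset) & low_mask(nbits);

	/* The value straddles two words: stitch the two halves together. */
	low = map[index] >> offset;
	high = map[index + 1] & low_mask(nbits - space);
	return low | (high << space);
}

/* Hypothetical model of a write of the low nbits of 'value' at 'start'. */
static void model_write(unsigned long *map, unsigned long value,
			unsigned long start, unsigned long nbits)
{
	unsigned long index = start / BITS_PER_LONG;
	unsigned long offset = start % BITS_PER_LONG;
	unsigned long space = BITS_PER_LONG - offset;

	if (!nbits || nbits > BITS_PER_LONG)
		return;
	value &= low_mask(nbits);

	if (space >= nbits) {
		map[index] &= ~(low_mask(nbits) << offset);
		map[index] |= value << offset;
		return;
	}

	/* The low part fills the top of the first word... */
	map[index] &= low_mask(offset);
	map[index] |= value << offset;
	/* ...and the remaining high part goes into the next word. */
	map[index + 1] &= ~low_mask(nbits - space);
	map[index + 1] |= value >> space;
}
```

Writing a 5-bit value three bits before a word boundary, for example,
leaves its low three bits at the top of one word and its high two bits
at the bottom of the next. The sketch is not part of the patch.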

The code was taken from "bitops: Introduce the for_each_set_clump macro"
by Syed Nayyar Waris with a number of changes and simplifications:
 - instead of using roundup(), which adds an unnecessary header
   dependency, we calculate space as BITS_PER_LONG-offset;
 - indentation is reduced by not using else-clauses (suggested by
   checkpatch for bitmap_get_value());
 - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
   and bitmap_write();
 - some redundant computations are omitted.

Cc: Arnd Bergmann
Signed-off-by: Syed Nayyar Waris
Signed-off-by: William Breathitt Gray
Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
Suggested-by: Yury Norov
Co-developed-by: Alexander Potapenko
Signed-off-by: Alexander Potapenko
Reviewed-by: Andy Shevchenko
Acked-by: Yury Norov
---
v11-mte:
 - add Yury's Acked-by:

v10-mte:
 - send this patch together with the "Implement MTE tag compression for
   swapped pages"

Revisions v8-v12 of bitmap patches were reviewed separately from the
"Implement MTE tag compression for swapped pages" series
(https://lore.kernel.org/lkml/20231109151106.2385155-1-glider@google.com/)

This patch was previously called "lib/bitmap: add
bitmap_{set,get}_value()"
(https://lore.kernel.org/lkml/20230720173956.3674987-2-glider@google.com/)

v11:
 - rearrange whitespace as requested by Andy Shevchenko,
   add Reviewed-by:, update a comment

v10:
 - update comments as requested by Andy Shevchenko

v8:
 - as suggested by Andy Shevchenko, handle reads/writes of more than
   BITS_PER_LONG bits, add a note for 32-bit systems

v7:
 - Address comments by Yury Norov, Andy Shevchenko, Rasmus Villemoes:
   - update code comments;
   - get rid of GENMASK();
   - s/assign_bit/__assign_bit;
   - more vertical whitespace for better readability;
   - more compact code for bitmap_write() (now for real)

v6:
 - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
   return 0.

v5:
 - Address comments by Yury Norov:
   - updated code comments and patch title/description
   - replace GENMASK(nbits - 1, 0) with BITMAP_LAST_WORD_MASK(nbits)
   - more compact bitmap_write() implementation

v4:
 - Address comments by Andy Shevchenko and Yury Norov:
   - prevent passing values >= 64 to GENMASK()
   - fix commit authorship
   - change comments
   - check for unlikely(nbits==0)
   - drop unnecessary const declarations
   - fix kernel-doc comments
   - rename bitmap_{get,set}_value() to bitmap_{read,write}()
---
 include/linux/bitmap.h | 77 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99451431e4d65..7ca0379be8c13 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -79,6 +79,10 @@ struct device;
  *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
  *  bitmap_get_value8(map, start)               Get 8bit value from map at start
  *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
+ *  bitmap_read(map, start, nbits)              Read an nbits-sized value from
+ *                                              map at start
+ *  bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
+ *                                              map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -636,6 +640,79 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 	map[index] |= value << offset;
 }
 
+/**
+ * bitmap_read - read a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
+ *
+ * Returns: value of @nbits bits located at the @start bit offset within the
+ * @map memory region. For @nbits = 0 and @nbits > BITS_PER_LONG the return
+ * value is undefined.
+ */
+static inline unsigned long bitmap_read(const unsigned long *map,
+					unsigned long start,
+					unsigned long nbits)
+{
+	size_t index = BIT_WORD(start);
+	unsigned long offset = start % BITS_PER_LONG;
+	unsigned long space = BITS_PER_LONG - offset;
+	unsigned long value_low, value_high;
+
+	if (unlikely(!nbits || nbits > BITS_PER_LONG))
+		return 0;
+
+	if (space >= nbits)
+		return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
+
+	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+	return (value_low >> offset) | (value_high << space);
+}
+
+/**
+ * bitmap_write - write n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value to write, clamped to nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
+ *
+ * bitmap_write() behaves as-if implemented as @nbits calls of __assign_bit(),
+ * i.e. bits beyond @nbits are ignored:
+ *
+ *   for (bit = 0; bit < nbits; bit++)
+ *           __assign_bit(start + bit, bitmap, val & BIT(bit));
+ *
+ * For @nbits == 0 and @nbits > BITS_PER_LONG no writes are performed.
+ */
+static inline void bitmap_write(unsigned long *map, unsigned long value,
+				unsigned long start, unsigned long nbits)
+{
+	size_t index;
+	unsigned long offset;
+	unsigned long space;
+	unsigned long mask;
+	bool fit;
+
+	if (unlikely(!nbits || nbits > BITS_PER_LONG))
+		return;
+
+	mask = BITMAP_LAST_WORD_MASK(nbits);
+	value &= mask;
+	offset = start % BITS_PER_LONG;
+	space = BITS_PER_LONG - offset;
+	fit = space >= nbits;
+	index = BIT_WORD(start);
+
+	map[index] &= (fit ? (~(mask << offset)) : ~BITMAP_FIRST_WORD_MASK(start));
+	map[index] |= value << offset;
+	if (fit)
+		return;
+
+	map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
+	map[index + 1] |= (value >> space);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sat Dec 27 17:02:36 2025
Date: Mon, 18 Dec 2023 13:40:28 +0100
In-Reply-To: <20231218124033.551770-1-glider@google.com>
References: <20231218124033.551770-1-glider@google.com>
Message-ID: <20231218124033.551770-3-glider@google.com>
Subject: [PATCH v11-mte 2/7] lib/test_bitmap: add tests for bitmap_{read,write}()
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

Add basic tests ensuring that values can be added at arbitrary positions
of the bitmap,
including those spanning into the adjacent unsigned longs.

Two new performance tests, test_bitmap_read_perf() and
test_bitmap_write_perf(), can be used to assess future performance
improvements of bitmap_read() and bitmap_write():

[ 0.431119][ T1] test_bitmap: Time spent in test_bitmap_read_perf: 615253
[ 0.433197][ T1] test_bitmap: Time spent in test_bitmap_write_perf: 916313

(numbers from an Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz machine
running QEMU).

Signed-off-by: Alexander Potapenko
Reviewed-by: Andy Shevchenko
Acked-by: Yury Norov
---
v11-mte:
 - add Yury's Acked-by:

v10-mte:
 - send this patch together with the "Implement MTE tag compression for
   swapped pages"

Revisions v8-v12 of bitmap patches were reviewed separately from the
"Implement MTE tag compression for swapped pages" series
(https://lore.kernel.org/lkml/20231109151106.2385155-1-glider@google.com/)

This patch was previously called
"lib/test_bitmap: add tests for bitmap_{set,get}_value()"
(https://lore.kernel.org/lkml/20230720173956.3674987-3-glider@google.com/)
and
"lib/test_bitmap: add tests for bitmap_{set,get}_value_unaligned"
(https://lore.kernel.org/lkml/20230713125706.2884502-3-glider@google.com/)

v12:
 - as suggested by Alexander Lobakin, replace expect_eq_uint() with
   expect_eq_ulong() and a cast

v9:
 - use WRITE_ONCE() to prevent optimizations in test_bitmap_read_perf()
 - update patch description

v8:
 - as requested by Andy Shevchenko, add tests for reading/writing
   sizes > BITS_PER_LONG

v7:
 - as requested by Yury Norov, add performance tests for bitmap_read()
   and bitmap_write()

v6:
 - use bitmap API to initialize test bitmaps
 - as requested by Yury Norov, do not check the return value of
   bitmap_read(..., 0)
 - fix a compiler warning on 32-bit systems

v5:
 - update patch title
 - address Yury Norov's comments:
   - rename the test cases
   - factor out test_bitmap_write_helper() to test writing over
     different background patterns;
   - add a test case copying a nontrivial value bit-by-bit;
   - drop volatile

v4:
 - Address comments by Andy Shevchenko: added Reviewed-by: and a link to
   the previous discussion
 - Address comments by Yury Norov:
   - expand the bitmap to catch more corner cases
   - add code testing that bitmap_set_value() does not touch adjacent
     bits
   - add code testing the nbits==0 case
   - rename bitmap_{get,set}_value() to bitmap_{read,write}()

v3:
 - switch to using bitmap_{set,get}_value()
 - change the expected bit pattern in test_set_get_value(),
   as the test was incorrectly assuming 0 is the LSB.
---
 lib/test_bitmap.c | 179 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 172 insertions(+), 7 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 65f22c2578b06..46c0154680772 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -60,18 +60,17 @@ static const unsigned long exp3_1_0[] __initconst = {
 };
 
 static bool __init
-__check_eq_uint(const char *srcfile, unsigned int line,
-		const unsigned int exp_uint, unsigned int x)
+__check_eq_ulong(const char *srcfile, unsigned int line,
+		 const unsigned long exp_ulong, unsigned long x)
 {
-	if (exp_uint != x) {
-		pr_err("[%s:%u] expected %u, got %u\n",
-			srcfile, line, exp_uint, x);
+	if (exp_ulong != x) {
+		pr_err("[%s:%u] expected %lu, got %lu\n",
+			srcfile, line, exp_ulong, x);
 		return false;
 	}
 	return true;
 }
 
-
 static bool __init
 __check_eq_bitmap(const char *srcfile, unsigned int line,
 		  const unsigned long *exp_bmap, const unsigned long *bmap,
@@ -185,7 +184,8 @@ __check_eq_str(const char *srcfile, unsigned int line,
 	result;								\
 })
 
-#define expect_eq_uint(...)		__expect_eq(uint, ##__VA_ARGS__)
+#define expect_eq_ulong(...)		__expect_eq(ulong, ##__VA_ARGS__)
+#define expect_eq_uint(x, y)		expect_eq_ulong((unsigned int)(x), (unsigned int)(y))
 #define expect_eq_bitmap(...)		__expect_eq(bitmap, ##__VA_ARGS__)
 #define expect_eq_pbl(...)		__expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)	__expect_eq(u32_array, ##__VA_ARGS__)
@@ -1245,6 +1245,168 @@ static void __init test_bitmap_const_eval(void)
 	BUILD_BUG_ON(~var != ~BIT(25));
 }
 
+/*
+ * Test bitmap should be big enough to include the cases when start is not in
+ * the first word, and start+nbits lands in the following word.
+ */
+#define TEST_BIT_LEN (1000)
+
+/*
+ * Helper function to test bitmap_write() overwriting the chosen byte pattern.
+ */
+static void __init test_bitmap_write_helper(const char *pattern)
+{
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	DECLARE_BITMAP(exp_bitmap, TEST_BIT_LEN);
+	DECLARE_BITMAP(pat_bitmap, TEST_BIT_LEN);
+	unsigned long w, r, bit;
+	int i, n, nbits;
+
+	/*
+	 * Only parse the pattern once and store the result in the intermediate
+	 * bitmap.
+	 */
+	bitmap_parselist(pattern, pat_bitmap, TEST_BIT_LEN);
+
+	/*
+	 * Check that writing a single bit does not accidentally touch the
+	 * adjacent bits.
+	 */
+	for (i = 0; i < TEST_BIT_LEN; i++) {
+		bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+		bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+		for (bit = 0; bit <= 1; bit++) {
+			bitmap_write(bitmap, bit, i, 1);
+			__assign_bit(i, exp_bitmap, bit);
+			expect_eq_bitmap(exp_bitmap, bitmap,
+					 TEST_BIT_LEN);
+		}
+	}
+
+	/* Ensure writing 0 bits does not change anything. */
+	bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+	bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+	for (i = 0; i < TEST_BIT_LEN; i++) {
+		bitmap_write(bitmap, ~0UL, i, 0);
+		expect_eq_bitmap(exp_bitmap, bitmap, TEST_BIT_LEN);
+	}
+
+	for (nbits = BITS_PER_LONG; nbits >= 1; nbits--) {
+		w = IS_ENABLED(CONFIG_64BIT) ? 0xdeadbeefdeadbeefUL
+					     : 0xdeadbeefUL;
+		w >>= (BITS_PER_LONG - nbits);
+		for (i = 0; i <= TEST_BIT_LEN - nbits; i++) {
+			bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+			bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+			for (n = 0; n < nbits; n++)
+				__assign_bit(i + n, exp_bitmap, w & BIT(n));
+			bitmap_write(bitmap, w, i, nbits);
+			expect_eq_bitmap(exp_bitmap, bitmap, TEST_BIT_LEN);
+			r = bitmap_read(bitmap, i, nbits);
+			expect_eq_ulong(r, w);
+		}
+	}
+}
+
+static void __init test_bitmap_read_write(void)
+{
+	unsigned char *pattern[3] = {"", "all:1/2", "all"};
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	unsigned long zero_bits = 0, bits_per_long = BITS_PER_LONG;
+	unsigned long val;
+	int i, pi;
+
+	/*
+	 * Reading/writing zero bits should not crash the kernel.
+	 * READ_ONCE() prevents constant folding.
+	 */
+	bitmap_write(NULL, 0, 0, READ_ONCE(zero_bits));
+	/* Return value of bitmap_read() is undefined here. */
+	bitmap_read(NULL, 0, READ_ONCE(zero_bits));
+
+	/*
+	 * Reading/writing more than BITS_PER_LONG bits should not crash the
+	 * kernel. READ_ONCE() prevents constant folding.
+	 */
+	bitmap_write(NULL, 0, 0, READ_ONCE(bits_per_long) + 1);
+	/* Return value of bitmap_read() is undefined here. */
+	bitmap_read(NULL, 0, READ_ONCE(bits_per_long) + 1);
+
+	/*
+	 * Ensure that bitmap_read() reads the same value that was previously
+	 * written, and two consecutive values are correctly merged.
+	 * The resulting bit pattern is asymmetric to rule out possible issues
+	 * with bit numeration order.
+	 */
+	for (i = 0; i < TEST_BIT_LEN - 7; i++) {
+		bitmap_zero(bitmap, TEST_BIT_LEN);
+
+		bitmap_write(bitmap, 0b10101UL, i, 5);
+		val = bitmap_read(bitmap, i, 5);
+		expect_eq_ulong(0b10101UL, val);
+
+		bitmap_write(bitmap, 0b101UL, i + 5, 3);
+		val = bitmap_read(bitmap, i + 5, 3);
+		expect_eq_ulong(0b101UL, val);
+
+		val = bitmap_read(bitmap, i, 8);
+		expect_eq_ulong(0b10110101UL, val);
+	}
+
+	for (pi = 0; pi < ARRAY_SIZE(pattern); pi++)
+		test_bitmap_write_helper(pattern[pi]);
+}
+
+static void __init test_bitmap_read_perf(void)
+{
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	unsigned int cnt, nbits, i;
+	unsigned long val;
+	ktime_t time;
+
+	bitmap_fill(bitmap, TEST_BIT_LEN);
+	time = ktime_get();
+	for (cnt = 0; cnt < 5; cnt++) {
+		for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
+			for (i = 0; i < TEST_BIT_LEN; i++) {
+				if (i + nbits > TEST_BIT_LEN)
+					break;
+				/*
+				 * Prevent the compiler from optimizing away the
+				 * bitmap_read() by using its value.
+				 */
+				WRITE_ONCE(val, bitmap_read(bitmap, i, nbits));
+			}
+		}
+	}
+	time = ktime_get() - time;
+	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+}
+
+static void __init test_bitmap_write_perf(void)
+{
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	unsigned int cnt, nbits, i;
+	unsigned long val = 0xfeedface;
+	ktime_t time;
+
+	bitmap_zero(bitmap, TEST_BIT_LEN);
+	time = ktime_get();
+	for (cnt = 0; cnt < 5; cnt++) {
+		for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
+			for (i = 0; i < TEST_BIT_LEN; i++) {
+				if (i + nbits > TEST_BIT_LEN)
+					break;
+				bitmap_write(bitmap, val, i, nbits);
+			}
+		}
+	}
+	time = ktime_get() - time;
+	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+}
+
+#undef TEST_BIT_LEN
+
 static void __init selftest(void)
 {
 	test_zero_clear();
@@ -1261,6 +1423,9 @@ static void __init selftest(void)
 	test_bitmap_cut();
 	test_bitmap_print_buf();
 	test_bitmap_const_eval();
+	test_bitmap_read_write();
+	test_bitmap_read_perf();
+	test_bitmap_write_perf();
 
 	test_find_nth_bit();
 	test_for_each_set_bit();
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sat Dec 27 17:02:36 2025
Date: Mon, 18 Dec 2023 13:40:29 +0100
In-Reply-To: <20231218124033.551770-1-glider@google.com>
References: <20231218124033.551770-1-glider@google.com>
Message-ID: <20231218124033.551770-4-glider@google.com>
Subject: [PATCH v11-mte 3/7] lib/test_bitmap: use pr_info() for non-error messages
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

pr_err() messages may be treated as errors by some log readers, so let
us only use them for test failures. For non-error messages, replace them
with pr_info().

Suggested-by: Alexander Lobakin
Signed-off-by: Alexander Potapenko
Acked-by: Yury Norov
---
v11-mte:
 - add Yury's Acked-by:

v10-mte:
 - send this patch together with the "Implement MTE tag compression for
   swapped pages"
---
 lib/test_bitmap.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 46c0154680772..a6e92cf5266af 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -507,7 +507,7 @@ static void __init test_bitmap_parselist(void)
 		}
 
 		if (ptest.flags & PARSE_TIME)
-			pr_err("parselist: %d: input is '%s' OK, Time: %llu\n",
+			pr_info("parselist: %d: input is '%s' OK, Time: %llu\n",
 					i, ptest.in, time);
 
 #undef ptest
@@ -546,7 +546,7 @@ static void __init test_bitmap_printlist(void)
 		goto out;
 	}
 
-	pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
+	pr_info("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
 out:
 	kfree(buf);
 	kfree(bmap);
@@ -624,7 +624,7 @@ static void __init test_bitmap_parse(void)
 		}
 
 		if (test.flags & PARSE_TIME)
-			pr_err("parse: %d: input is '%s' OK, Time: %llu\n",
+			pr_info("parse: %d: input is '%s' OK, Time: %llu\n",
 					i, test.in, time);
 	}
 }
@@ -1380,7 +1380,7 @@ static void __init test_bitmap_read_perf(void)
 		}
 	}
 	time = ktime_get() - time;
-	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+	pr_info("Time spent in %s:\t%llu\n", __func__, time);
 }
 
 static void __init test_bitmap_write_perf(void)
@@ -1402,7 +1402,7 @@ static void __init test_bitmap_write_perf(void)
 		}
 	}
 	time = ktime_get() - time;
-	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+	pr_info("Time spent in %s:\t%llu\n", __func__, time);
 }
 
 #undef TEST_BIT_LEN
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sat Dec 27 17:02:36 2025
Date: Mon, 18 Dec 2023 13:40:30 +0100
In-Reply-To: <20231218124033.551770-1-glider@google.com>
References: <20231218124033.551770-1-glider@google.com>
Message-ID: <20231218124033.551770-5-glider@google.com>
Subject: [PATCH v11-mte 4/7] arm64: mte: implement CONFIG_ARM64_MTE_COMP
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

The config implements the algorithm compressing memory tags for ARM MTE
during swapping.

The algorithm is based on RLE and specifically targets buffers of tags
corresponding to a single page. In many cases a buffer can be compressed
into 63 bits, making it possible to store it without additional memory
allocation.

Suggested-by: Evgenii Stepanov
Signed-off-by: Alexander Potapenko
Acked-by: Catalin Marinas
---
v11-mte:
 - Address Yury's comments: change variable name to reflow the code, add
   a comment

v10-mte:
 - added Catalin's Acked-by:

v8:
 - As suggested by Catalin Marinas, only compress tags if they can be
   stored inline. This simplifies the code drastically.
 - Update the documentation.
- Split off patches introducing bitmap_read()/bitmap_write(). v6: - shuffle bits in inline handles so that they can't be confused with canonical pointers; - use kmem_cache_zalloc() to allocate compressed storage - correctly handle range size overflow - minor documentation fixes, clarify the special cases v5: - make code PAGE_SIZE-agnostic, remove hardcoded constants, updated the docs - implement debugfs interface - Address comments by Andy Shevchenko: - update description of mtecomp.c - remove redundant assignments, simplify mte_tags_to_ranges() - various code simplifications - introduce mtecomp.h - add :export: to Documentation/arch/arm64/mte-tag-compression.rst v4: - Addressed comments by Andy Shevchenko: - expanded "MTE" to "Memory Tagging Extension" in Kconfig - fixed kernel-doc comments, moved them to C source - changed variables to unsigned where applicable - some code simplifications, fewer unnecessary assignments - added the mte_largest_idx_bits() helper - added namespace prefixes to all functions - added missing headers (but removed bits.h) - Addressed comments by Yury Norov: - removed test-only functions from mtecomp.h - dropped the algorithm name (all functions are now prefixed with "mte") - added more comments - got rid of MTE_RANGES_INLINE - renamed bitmap_{get,set}_value() to bitmap_{read,write}() - moved the big comment explaining the algorithm to Documentation/arch/arm64/mte-tag-compression.rst, expanded it, add a link to it from Documentation/arch/arm64/index.rst - removed hardcoded ranges from mte_alloc_size()/mte_size_to_ranges() v3: - Addressed comments by Andy Shevchenko: - use bitmap_{set,get}_value() written by Syed Nayyar Waris - switched to unsigned long everywhere (fewer casts) - simplified the code, removed redundant checks - dropped ea0_compress_inline() - added bit size constants and helpers to access the bitmap - explicitly initialize all compressed sizes in ea0_compress_to_buf() - initialize all handle bits v2: - as suggested by Yury 
Norov, switched from struct bitq (which is not needed anymore) to - add missing symbol exports fixup CONFIG_ARM64_MTE_COMP --- Documentation/arch/arm64/index.rst | 1 + .../arch/arm64/mte-tag-compression.rst | 154 +++++++++++ arch/arm64/Kconfig | 11 + arch/arm64/include/asm/mtecomp.h | 39 +++ arch/arm64/mm/Makefile | 1 + arch/arm64/mm/mtecomp.c | 260 ++++++++++++++++++ arch/arm64/mm/mtecomp.h | 12 + 7 files changed, 478 insertions(+) create mode 100644 Documentation/arch/arm64/mte-tag-compression.rst create mode 100644 arch/arm64/include/asm/mtecomp.h create mode 100644 arch/arm64/mm/mtecomp.c create mode 100644 arch/arm64/mm/mtecomp.h diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/= index.rst index d08e924204bf1..bf6c1583233a9 100644 --- a/Documentation/arch/arm64/index.rst +++ b/Documentation/arch/arm64/index.rst @@ -19,6 +19,7 @@ ARM64 Architecture legacy_instructions memory memory-tagging-extension + mte-tag-compression perf pointer-authentication ptdump diff --git a/Documentation/arch/arm64/mte-tag-compression.rst b/Documentati= on/arch/arm64/mte-tag-compression.rst new file mode 100644 index 0000000000000..8fe6b51a9db6d --- /dev/null +++ b/Documentation/arch/arm64/mte-tag-compression.rst @@ -0,0 +1,154 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D +Tag Compression for Memory Tagging Extension (MTE) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D + +This document describes the algorithm used to compress memory tags used by= the +ARM Memory Tagging Extension (MTE). 
+ +Introduction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +MTE assigns tags to memory pages: for 4K pages those tags occupy 128 bytes +(256 4-bit tags each corresponding to a 16-byte MTE granule), for 16K page= s - +512 bytes, for 64K pages - 2048 bytes. By default, MTE carves out 3.125% (= 1/32) +of the available physical memory to store the tags. + +When MTE pages are saved to swap, their tags need to be stored in the kern= el +memory. If the system swap is used heavily, these tags may take a substant= ial +portion of the physical memory. To reduce memory waste, ``CONFIG_ARM64_MTE= _COMP`` +allows the kernel to store the tags in compressed form. + +Implementation details +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The algorithm attempts to compress an array of ``MTE_PAGE_TAG_STORAGE`` +tag bytes into a byte sequence that can be stored in an 8-byte pointer. If= that +is not possible, the data is stored uncompressed. + +Tag manipulation and storage +---------------------------- + +Tags for swapped pages are stored in an XArray that maps swap entries to 6= 3-bit +values (see ``arch/arm64/mm/mteswap.c``). Bit 0 of these values indicates = how +their contents should be treated: + + - 0: value is a pointer to an uncompressed buffer allocated with kmalloc() + (always the case if ``CONFIG_ARM64_MTE_COMP=3Dn``) with the highest bit= set + to 0; + - 1: value contains compressed data. + +``arch/arm64/include/asm/mtecomp.h`` declares the following functions that +manipulate tags: + +- mte_compress() - compresses the given ``MTE_PAGE_TAG_STORAGE``-byte ``ta= gs`` + buffer into a pointer; +- mte_decompress() - decompresses the tags from a pointer; +- mte_is_compressed() - returns ``true`` iff the pointer passed to it shou= ld be + treated as compressed data. 
+ +Tag compression +--------------- + +The compression algorithm is a variation of RLE (run-length encoding) and = works +as follows (we will be considering 4K pages and 128-byte tag buffers, but = the +same approach scales to 16K and 64K pages): + +1. The input array of 128 (``MTE_PAGE_TAG_STORAGE``) bytes is transformed = into + tag ranges (two arrays: ``r_tags[]`` containing tag values and ``r_size= s[]`` + containing range lengths) by mte_tags_to_ranges(). Note that + ``r_sizes[]`` sums up to 256 (``MTE_GRANULES_PER_PAGE``). + + If ``r_sizes[]`` consists of a single element + (``{ MTE_GRANULES_PER_PAGE }``), the corresponding range is split into = two + halves, i.e.:: + + r_sizes_new[2] =3D { MTE_GRANULES_PER_PAGE/2, MTE_GRANULES_PER_PAGE/2= }; + r_tags_new[2] =3D { r_tags[0], r_tags[0] }; + +2. The index of the largest element of ``r_sizes[]`` is stored in + ``largest_idx``. The element itself is thrown away from ``r_sizes[]``, + because it can be reconstructed from the sum of the remaining elements.= Note + that now none of the remaining ``r_sizes[]`` elements exceeds + ``MTE_GRANULES_PER_PAGE/2``. + +3. If the number ``N`` of ranges does not exceed ``6``, the ranges can be + compressed into 64 bits. This is done by storing the following values p= acked + into the pointer (``i<size>`` means a ``<size>``-bit unsigned integer) + treated as a bitmap (see ``include/linux/bitmap.h``):: + + bit 0 : (always 1) : i1 + bits 1-3 : largest_idx : i3 + bits 4-27 : r_tags[0..5] : i4 x 6 + bits 28-62 : r_sizes[0..4] : i7 x 5 + bit 63 : (always 0) : i1 + + If N is less than 6, ``r_tags`` and ``r_sizes`` are padded up with zero + values. The unused bits in the pointer, including bit 63, are also set = to 0, + so the compressed data can be stored in XArray. + + Range size of ``MTE_GRANULES_PER_PAGE/2`` (at most one) does not fit in= to + i7 and will be written as 0. This case is handled separately by the + decompressing procedure. 
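The bit layout in step 3 of "Tag compression" can be illustrated with a small userspace sketch for the 4K-page case. It is only an approximation under stated assumptions: put_bits(), pack_ranges() and the constants below are hypothetical names that do not appear in the patch, and the kernel code writes the fields with bitmap_write() rather than with open-coded shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants mirroring the 4K-page row of the layout. */
#define GRANULES   256 /* tags per 4K page */
#define MAX_RANGES 6   /* N for 4K pages */
#define IDX_BITS   3   /* largest_idx : i3 */
#define TAG_BITS   4   /* r_tags[]    : i4 */
#define SIZE_BITS  7   /* r_sizes[]   : i7 */

/* Append the low `bits` bits of `value` at bit offset *pos. */
static void put_bits(uint64_t *word, uint64_t value, unsigned int *pos,
		     unsigned int bits)
{
	uint64_t mask = (UINT64_C(1) << bits) - 1;

	*word |= (value & mask) << *pos;
	*pos += bits;
}

/*
 * Pack up to MAX_RANGES tag ranges into one 64-bit word.
 * Returns 0 if the ranges do not fit (0 is never a valid packed
 * value, because bit 0 of packed data is always 1).
 */
static uint64_t pack_ranges(const uint8_t *tags, const uint16_t *sizes,
			    size_t len)
{
	uint8_t t[MAX_RANGES] = { 0 };  /* zero-padded tags */
	uint16_t s[MAX_RANGES] = { 0 }; /* zero-padded sizes */
	unsigned int pos = 0, largest_idx = 0;
	uint64_t word = 0;
	size_t i;

	if (len == 0 || len > MAX_RANGES)
		return 0;
	for (i = 0; i < len; i++) {
		t[i] = tags[i];
		s[i] = sizes[i];
	}
	/* A single full-page range is split into two halves (step 1). */
	if (len == 1) {
		t[1] = t[0];
		s[0] = s[1] = GRANULES / 2;
	}
	/* The first largest range is omitted from the stored sizes. */
	for (i = 1; i < MAX_RANGES; i++)
		if (s[i] > s[largest_idx])
			largest_idx = (unsigned int)i;

	put_bits(&word, 1, &pos, 1);                  /* bit 0      */
	put_bits(&word, largest_idx, &pos, IDX_BITS); /* bits 1-3   */
	for (i = 0; i < MAX_RANGES; i++)              /* bits 4-27  */
		put_bits(&word, t[i], &pos, TAG_BITS);
	for (i = 0; i < MAX_RANGES; i++)              /* bits 28-62 */
		if (i != largest_idx)
			put_bits(&word, s[i], &pos, SIZE_BITS);
	/* Only 63 bits are written, so bit 63 stays 0. */
	return word;
}
```

In this sketch a size of ``MTE_GRANULES_PER_PAGE/2`` (128) is masked to 7 bits and therefore stored as 0, which is the special case the text says the decompressor has to recognize.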
+ +Tag decompression +----------------- + +The decompression algorithm performs the steps below. + +1. Read the lowest bit of the data from the input buffer and check that it= is 1, + otherwise bail out. + +2. Read ``largest_idx``, ``r_tags[]`` and ``r_sizes[]`` from the + input buffer. + + If ``largest_idx`` is zero, and all ``r_sizes[]`` are zero, set + ``r_sizes[0] =3D MTE_GRANULES_PER_PAGE/2``. + + Calculate the removed largest element of ``r_sizes[]`` as + ``largest =3D 256 - sum(r_sizes)`` and insert it into ``r_sizes`` at + position ``largest_idx``. + +3. For each ``r_sizes[i] > 0``, add a 4-bit value ``r_tags[i]`` to the out= put + buffer ``r_sizes[i]`` times. + + +Why these numbers? +------------------ + +To be able to reconstruct ``N`` tag ranges from the compressed data, we ne= ed to +store the indicator bit together with ``largest_idx``, ``r_tags[N]``, and +``r_sizes[N-1]`` in 63 bits. +Knowing that the sizes do not exceed ``MTE_PAGE_TAG_STORAGE``, each of the= m can be +packed into ``S =3D ilog2(MTE_PAGE_TAG_STORAGE)`` bits, whereas a single t= ag occupies +4 bits. + +It is evident that the number of ranges that can be stored in 63 bits is +strictly less than 8, therefore we only need 3 bits to store ``largest_idx= ``. + +The maximum values of ``N`` so that the number ``1 + 3 + N * 4 + (N-1) * S= `` of +storage bits does not exceed 63, are shown in the table below:: + + +-----------+-----------------+----+---+-------------------+ + | Page size | Tag buffer size | S | N | Storage bits | + +-----------+-----------------+----+---+-------------------+ + | 4 KB | 128 B | 7 | 6 | 63 =3D 1+3+6*4+5*7 | + | 16 KB | 512 B | 9 | 5 | 60 =3D 1+3+5*4+4*9 | + | 64 KB | 2048 B | 11 | 4 | 53 =3D 1+3+4*4+3*11 | + +-----------+-----------------+----+---+-------------------+ + +Note +---- + +Tag compression and decompression implicitly rely on the fixed MTE tag size +(4 bits) and number of tags per page. Should these values change, the algo= rithm +may need to be revised. 
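The ``N`` column of the table in "Why these numbers?" can be re-derived with a few lines of C. This is a sketch for checking the arithmetic only: max_ranges_for_size_bits() is a hypothetical helper, not a function from the patch, which hardcodes the three values as ``MTE_MAX_RANGES``.

```c
#include <assert.h>

/*
 * Return the largest N for which
 *   1 + 3 + N * 4 + (N - 1) * size_bits <= 63,
 * i.e. the indicator bit, the 3-bit largest_idx, N 4-bit tags and
 * N - 1 sizes of size_bits bits each still fit into 63 bits.
 */
static unsigned int max_ranges_for_size_bits(unsigned int size_bits)
{
	unsigned int n = 1;

	/* Grow N while N + 1 ranges would still fit. */
	while (1 + 3 + 4 * (n + 1) + n * size_bits <= 63)
		n++;
	return n;
}
```

For ``S`` equal to 7, 9 and 11 this yields 6, 5 and 4 ranges, matching the 4 KB, 16 KB and 64 KB rows.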
+ + +Programming Interface +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + + .. kernel-doc:: arch/arm64/include/asm/mtecomp.h + .. kernel-doc:: arch/arm64/mm/mtecomp.c + :export: diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 7b071a00425d2..5f4d4b49a512e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2078,6 +2078,17 @@ config ARM64_EPAN if the cpu does not implement the feature. endmenu # "ARMv8.7 architectural features" =20 +config ARM64_MTE_COMP + bool "Tag compression for ARM64 Memory Tagging Extension" + default y + depends on ARM64_MTE + help + Enable tag compression support for ARM64 Memory Tagging Extension. + + Tag buffers corresponding to swapped RAM pages are compressed using + RLE to conserve heap memory. In the common case compressed tags + occupy 2.5x less memory. + config ARM64_SVE bool "ARM Scalable Vector Extension support" default y diff --git a/arch/arm64/include/asm/mtecomp.h b/arch/arm64/include/asm/mtec= omp.h new file mode 100644 index 0000000000000..b9a3a921a38d4 --- /dev/null +++ b/arch/arm64/include/asm/mtecomp.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ASM_MTECOMP_H +#define __ASM_MTECOMP_H + +#include + +/** + * mte_is_compressed() - check if the supplied pointer contains compressed= tags. + * @ptr: pointer returned by kmalloc() or mte_compress(). + * + * Returns: true iff bit 0 of @ptr is 1, which is only possible if @ptr was + * returned by mte_compress(). 
+ */ +static inline bool mte_is_compressed(void *ptr) +{ + return ((unsigned long)ptr & 1); +} + +#if defined(CONFIG_ARM64_MTE_COMP) + +void *mte_compress(u8 *tags); +bool mte_decompress(void *handle, u8 *tags); + +#else + +static inline void *mte_compress(u8 *tags) +{ + return NULL; +} + +static inline bool mte_decompress(void *data, u8 *tags) +{ + return false; +} + +#endif // CONFIG_ARM64_MTE_COMP + +#endif // __ASM_MTECOMP_H diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index dbd1bc95967d0..46778f6dd83c2 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -10,6 +10,7 @@ obj-$(CONFIG_TRANS_TABLE) +=3D trans_pgd.o obj-$(CONFIG_TRANS_TABLE) +=3D trans_pgd-asm.o obj-$(CONFIG_DEBUG_VIRTUAL) +=3D physaddr.o obj-$(CONFIG_ARM64_MTE) +=3D mteswap.o +obj-$(CONFIG_ARM64_MTE_COMP) +=3D mtecomp.o KASAN_SANITIZE_physaddr.o +=3D n =20 obj-$(CONFIG_KASAN) +=3D kasan_init.o diff --git a/arch/arm64/mm/mtecomp.c b/arch/arm64/mm/mtecomp.c new file mode 100644 index 0000000000000..bb5cbd3edb5ba --- /dev/null +++ b/arch/arm64/mm/mtecomp.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * MTE tag compression algorithm. + * See Documentation/arch/arm64/mte-tag-compression.rst for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "mtecomp.h" + +#define MTE_BITS_PER_LARGEST_IDX 3 +/* Range size cannot exceed MTE_GRANULES_PER_PAGE / 2. */ +#define MTE_BITS_PER_SIZE (ilog2(MTE_GRANULES_PER_PAGE) - 1) + +/* + * See Documentation/arch/arm64/mte-tag-compression.rst for details on how= the + * maximum number of ranges is calculated. + */ +#if defined(CONFIG_ARM64_4K_PAGES) +#define MTE_MAX_RANGES 6 +#elif defined(CONFIG_ARM64_16K_PAGES) +#define MTE_MAX_RANGES 5 +#else +#define MTE_MAX_RANGES 4 +#endif + +/** + * mte_tags_to_ranges() - break @tags into arrays of tag ranges. + * @tags: MTE_GRANULES_PER_PAGE-byte array containing MTE tags. 
+ * @out_tags: u8 array to store the tag of every range. + * @out_sizes: unsigned short array to store the size of every range. + * @out_len: length of @out_tags and @out_sizes (output parameter, initial= ly + * equal to lengths of out_tags[] and out_sizes[]). + * + * This function is exported for testing purposes. + */ +void mte_tags_to_ranges(u8 *tags, u8 *out_tags, unsigned short *out_sizes, + size_t *out_len) +{ + u8 prev_tag =3D tags[0] / 16; /* First tag in the array. */ + unsigned int cur_idx =3D 0, i, j; + u8 cur_tag; + + memset(out_tags, 0, array_size(*out_len, sizeof(*out_tags))); + memset(out_sizes, 0, array_size(*out_len, sizeof(*out_sizes))); + + out_tags[cur_idx] =3D prev_tag; + for (i =3D 0; i < MTE_GRANULES_PER_PAGE; i++) { + j =3D i % 2; + cur_tag =3D j ? (tags[i / 2] % 16) : (tags[i / 2] / 16); + if (cur_tag =3D=3D prev_tag) { + out_sizes[cur_idx]++; + } else { + cur_idx++; + prev_tag =3D cur_tag; + out_tags[cur_idx] =3D prev_tag; + out_sizes[cur_idx] =3D 1; + } + } + *out_len =3D cur_idx + 1; +} +EXPORT_SYMBOL_NS(mte_tags_to_ranges, MTECOMP); + +/** + * mte_ranges_to_tags() - fill @tags using given tag ranges. + * @r_tags: u8[] containing the tag of every range. + * @r_sizes: unsigned short[] containing the size of every range. + * @r_len: length of @r_tags and @r_sizes. + * @tags: MTE_GRANULES_PER_PAGE-byte array to write the tags to. + * + * This function is exported for testing purposes. 
+ */ +void mte_ranges_to_tags(u8 *r_tags, unsigned short *r_sizes, size_t r_len, + u8 *tags) +{ + unsigned int i, j, pos =3D 0; + u8 prev; + + for (i =3D 0; i < r_len; i++) { + for (j =3D 0; j < r_sizes[i]; j++) { + if (pos % 2) + tags[pos / 2] =3D (prev << 4) | r_tags[i]; + else + prev =3D r_tags[i]; + pos++; + } + } +} +EXPORT_SYMBOL_NS(mte_ranges_to_tags, MTECOMP); + +static void mte_bitmap_write(unsigned long *bitmap, unsigned long value, + unsigned long *pos, unsigned long bits) +{ + bitmap_write(bitmap, value, *pos, bits); + *pos +=3D bits; +} + +/* Compress ranges into an unsigned long. */ +static void mte_compress_to_ulong(size_t len, u8 *tags, unsigned short *si= zes, + unsigned long *result) +{ + unsigned long bit_pos =3D 0; + unsigned int largest_idx, i; + unsigned short largest =3D 0; + + for (i =3D 0; i < len; i++) { + if (sizes[i] > largest) { + largest =3D sizes[i]; + largest_idx =3D i; + } + } + /* Bit 1 in position 0 indicates compressed data. */ + mte_bitmap_write(result, 1, &bit_pos, 1); + mte_bitmap_write(result, largest_idx, &bit_pos, + MTE_BITS_PER_LARGEST_IDX); + for (i =3D 0; i < len; i++) + mte_bitmap_write(result, tags[i], &bit_pos, MTE_TAG_SIZE); + if (len =3D=3D 1) { + /* + * We are compressing MTE_GRANULES_PER_PAGE of identical tags. + * Split it into two ranges containing + * MTE_GRANULES_PER_PAGE / 2 tags, so that it falls into the + * special case described below. + */ + mte_bitmap_write(result, tags[0], &bit_pos, MTE_TAG_SIZE); + i =3D 2; + } else { + i =3D len; + } + for (; i < MTE_MAX_RANGES; i++) + mte_bitmap_write(result, 0, &bit_pos, MTE_TAG_SIZE); + /* + * Most of the time sizes[i] fits into MTE_BITS_PER_SIZE, apart from a + * special case when: + * len =3D 2; + * sizes =3D { MTE_GRANULES_PER_PAGE / 2, MTE_GRANULES_PER_PAGE / 2}; + * In this case largest_idx will be set to 0, and the size written to + * the bitmap will be also 0. 
+ */ + for (i =3D 0; i < len; i++) { + if (i !=3D largest_idx) + mte_bitmap_write(result, sizes[i], &bit_pos, + MTE_BITS_PER_SIZE); + } + for (i =3D len; i < MTE_MAX_RANGES; i++) + mte_bitmap_write(result, 0, &bit_pos, MTE_BITS_PER_SIZE); +} + +/** + * mte_compress() - compress the given tag array. + * @tags: MTE_GRANULES_PER_PAGE-byte array to read the tags from. + * + * Attempts to compress the user-supplied tag array. + * + * Returns: compressed data or NULL. + */ +void *mte_compress(u8 *tags) +{ + unsigned short *r_sizes; + void *result =3D NULL; + u8 *r_tags; + size_t r_len; + + r_sizes =3D kmalloc_array(MTE_GRANULES_PER_PAGE, sizeof(unsigned short), + GFP_KERNEL); + r_tags =3D kmalloc(MTE_GRANULES_PER_PAGE, GFP_KERNEL); + if (!r_sizes || !r_tags) + goto ret; + r_len =3D MTE_GRANULES_PER_PAGE; + mte_tags_to_ranges(tags, r_tags, r_sizes, &r_len); + if (r_len <=3D MTE_MAX_RANGES) + mte_compress_to_ulong(r_len, r_tags, r_sizes, + (unsigned long *)&result); +ret: + kfree(r_tags); + kfree(r_sizes); + return result; +} +EXPORT_SYMBOL_NS(mte_compress, MTECOMP); + +static unsigned long mte_bitmap_read(const unsigned long *bitmap, + unsigned long *pos, unsigned long bits) +{ + unsigned long start =3D *pos; + + *pos +=3D bits; + return bitmap_read(bitmap, start, bits); +} + +/** + * mte_decompress() - decompress the tag array from the given pointer. + * @data: pointer returned by @mte_compress() + * @tags: MTE_GRANULES_PER_PAGE-byte array to write the tags to. + * + * Reads the compressed data and writes it into the user-supplied tag arra= y. + * + * Returns: true on success, false if the passed data is uncompressed. 
+ */ +bool mte_decompress(void *data, u8 *tags) +{ + unsigned short r_sizes[MTE_MAX_RANGES]; + u8 r_tags[MTE_MAX_RANGES]; + unsigned int largest, i; + unsigned long bit_pos =3D 0; + unsigned long *bitmap; + unsigned short sum; + size_t max_ranges; + + if (!mte_is_compressed(data)) + return false; + + /* + * @data contains compressed data encoded in the pointer itself. + * Treat its contents as a bitmap. + */ + bitmap =3D (unsigned long *)&data; + max_ranges =3D MTE_MAX_RANGES; + /* Skip the leading bit indicating the inline case. */ + mte_bitmap_read(bitmap, &bit_pos, 1); + largest =3D mte_bitmap_read(bitmap, &bit_pos, MTE_BITS_PER_LARGEST_IDX); + if (largest >=3D MTE_MAX_RANGES) + return false; + + for (i =3D 0; i < max_ranges; i++) + r_tags[i] =3D mte_bitmap_read(bitmap, &bit_pos, MTE_TAG_SIZE); + for (i =3D 0, sum =3D 0; i < max_ranges; i++) { + if (i =3D=3D largest) + continue; + r_sizes[i] =3D + mte_bitmap_read(bitmap, &bit_pos, MTE_BITS_PER_SIZE); + /* + * Special case: tag array consists of two ranges of + * `MTE_GRANULES_PER_PAGE / 2` tags. + */ + if ((largest =3D=3D 0) && (i =3D=3D 1) && (r_sizes[i] =3D=3D 0)) + r_sizes[i] =3D MTE_GRANULES_PER_PAGE / 2; + if (!r_sizes[i]) { + max_ranges =3D i; + break; + } + sum +=3D r_sizes[i]; + } + if (sum >=3D MTE_GRANULES_PER_PAGE) + return false; + r_sizes[largest] =3D MTE_GRANULES_PER_PAGE - sum; + mte_ranges_to_tags(r_tags, r_sizes, max_ranges, tags); + return true; +} +EXPORT_SYMBOL_NS(mte_decompress, MTECOMP); diff --git a/arch/arm64/mm/mtecomp.h b/arch/arm64/mm/mtecomp.h new file mode 100644 index 0000000000000..b94cf0384f2af --- /dev/null +++ b/arch/arm64/mm/mtecomp.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef ARCH_ARM64_MM_MTECOMP_H_ +#define ARCH_ARM64_MM_MTECOMP_H_ + +/* Functions exported from mtecomp.c for test_mtecomp.c. 
*/ +void mte_tags_to_ranges(u8 *tags, u8 *out_tags, unsigned short *out_sizes, + size_t *out_len); +void mte_ranges_to_tags(u8 *r_tags, unsigned short *r_sizes, size_t r_len, + u8 *tags); + +#endif // ARCH_ARM64_MM_TEST_MTECOMP_H_ --=20 2.43.0.472.g3155946c3a-goog From nobody Sat Dec 27 17:02:36 2025 Received: from mail-ej1-f73.google.com (mail-ej1-f73.google.com [209.85.218.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EAB757EFA0 for ; Mon, 18 Dec 2023 12:40:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--glider.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="MrVukIOg" Received: by mail-ej1-f73.google.com with SMTP id a640c23a62f3a-a1f8a2945b9so155958766b.1 for ; Mon, 18 Dec 2023 04:40:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1702903252; x=1703508052; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Z23ZDr01NkgZyjyXcvWE5FnYbJhU70TfuuBALSdMK5s=; b=MrVukIOgOQdNYIGjwJdps2FZ4rt4uEHNhELdNfT7oIZByEEdHIR8zTR+MNxPWwt+Ta SO7/uEevGAWtrUTpocY3NsCEQGHycI9vxItR888uc3IdN9NM+IrI8041U9FgUpbBjgHp NAEDX7kGpb533xKLnetKXQAe4+SvnBlq3/Pqdq+xM8tdKokCpFLrOpE7GrOqbZebqSNi NJf0jSv9znp8TIItD+p5htk7RbSMa5W9JuFD/73wvmKChccC15yubXqvIuDq+sR9JX3U +9R57kFHDpEGU/DY9Vib1yr2mQxT+K5Jcty8i4mNaYHHygzpFmyWCMqt9e88Zr9ApVZz areg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702903252; x=1703508052; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=Z23ZDr01NkgZyjyXcvWE5FnYbJhU70TfuuBALSdMK5s=; b=S4MVr3seSULNs10V7txgFnHmN2qORllKEIYSHUIidI0CnKk8ETZvq8ZwJIafCk3AMV abZBw5ukV9ZrZp6BblrVmyQl1gV/M2s607vUNnBLWN7i8chIjpNqkXgnXfk/+PsrPhZ7 ePI6zCqep0TOtk0gHlDdKw2oA0TXeI3Xjg9G8yBu9k2xxplfi//3aXntuIGIcoRFHrMj A/p77RN4e0RVyDXdh/d2e/xP33j/UxRHYDDasTe9Zx5q/6gjNNPKcel0Shf77m10NqcD CI8f9n6Q0DZfscLrmhG05bbYyqRFi2UDulhy70C+YelkZPBMD7QVFqihn6X2bq4i/ND9 6N8A== X-Gm-Message-State: AOJu0Yy4U1Ol/M7U01CD6Ate+SNaNwpsQis/39nN5UvTWdGh/u+U1wvS GRfp4CvM6GpeMXHaiCBsqwtuxt1/NqQ= X-Google-Smtp-Source: AGHT+IEG7SZm39BUnUDqFuUftt1CnoHdSRtMm7716Xofyrc+xyE3calc3n/Yqi9+Gb/IEzzTIXe9/2MrRSM= X-Received: from glider.muc.corp.google.com ([2a00:79e0:9c:201:9a3f:c8e7:1395:806e]) (user=glider job=sendgmr) by 2002:a17:906:cf89:b0:a23:46b9:9957 with SMTP id um9-20020a170906cf8900b00a2346b99957mr9871ejb.5.1702903252167; Mon, 18 Dec 2023 04:40:52 -0800 (PST) Date: Mon, 18 Dec 2023 13:40:31 +0100 In-Reply-To: <20231218124033.551770-1-glider@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231218124033.551770-1-glider@google.com> X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog Message-ID: <20231218124033.551770-6-glider@google.com> Subject: [PATCH v11-mte 5/7] arm64: mte: add a test for MTE tags compression From: Alexander Potapenko To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com, aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com, alexandru.elisei@arm.com Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Ensure that tag sequences containing alternating values are compressed to buffers of expected size and correctly decompressed afterwards. 
Signed-off-by: Alexander Potapenko Acked-by: Catalin Marinas --- v10-mte: - added Catalin's Acked-by: v9: - minor changes to Kconfig description v8: - adapt to the simplified compression algorithm v6: - add test_decompress_invalid() to ensure invalid handles are ignored; - add test_upper_bits(), which is a regression test for a case where an inline handle looked like an out-of-line one; - add test_compress_nonzero() to ensure a full nonzero tag array is compressed correctly; - add test_two_ranges() to test cases when the input buffer is divided into two ranges. v5: - remove hardcoded constants, added test setup/teardown; - support 16- and 64K pages; - replace nested if-clauses with expected_size_from_ranges(); - call mte_release_handle() after tests that perform compression/decompression; - address comments by Andy Shevchenko: - fix include order; - use mtecomp.h instead of function prototypes. v4: - addressed comments by Andy Shevchenko: - expanded MTE to "Memory Tagging Extension" in Kconfig - changed signed variables to unsigned where applicable - added missing header dependencies - addressed comments by Yury Norov: - moved test-only declarations from mtecomp.h into this test - switched to the new "mte"-prefixed function names, dropped the mentions of "EA0" - added test_tag_to_ranges_n() v3: - addressed comments by Andy Shevchenko in another patch: - switched from u64 to unsigned long - added MODULE_IMPORT_NS(MTECOMP) - fixed includes order --- arch/arm64/Kconfig | 11 ++ arch/arm64/mm/Makefile | 1 + arch/arm64/mm/test_mtecomp.c | 364 +++++++++++++++++++++++++++++++++++ 3 files changed, 376 insertions(+) create mode 100644 arch/arm64/mm/test_mtecomp.c diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 5f4d4b49a512e..6a1397a96f2f0 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2089,6 +2089,17 @@ config ARM64_MTE_COMP RLE to conserve heap memory. In the common case compressed tags occupy 2.5x less memory. 
=20 +config ARM64_MTE_COMP_KUNIT_TEST + tristate "Test tag compression for ARM64 Memory Tagging Extension" if !KU= NIT_ALL_TESTS + default KUNIT_ALL_TESTS + depends on KUNIT && ARM64_MTE_COMP + help + Test MTE compression algorithm enabled by CONFIG_ARM64_MTE_COMP. + + Ensure that certain tag sequences containing alternating values can + be compressed into pointer-size values and correctly decompressed + afterwards. + config ARM64_SVE bool "ARM Scalable Vector Extension support" default y diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index 46778f6dd83c2..170dc62b010b9 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -11,6 +11,7 @@ obj-$(CONFIG_TRANS_TABLE) +=3D trans_pgd-asm.o obj-$(CONFIG_DEBUG_VIRTUAL) +=3D physaddr.o obj-$(CONFIG_ARM64_MTE) +=3D mteswap.o obj-$(CONFIG_ARM64_MTE_COMP) +=3D mtecomp.o +obj-$(CONFIG_ARM64_MTE_COMP_KUNIT_TEST) +=3D test_mtecomp.o KASAN_SANITIZE_physaddr.o +=3D n =20 obj-$(CONFIG_KASAN) +=3D kasan_init.o diff --git a/arch/arm64/mm/test_mtecomp.c b/arch/arm64/mm/test_mtecomp.c new file mode 100644 index 0000000000000..e8aeb7607ff41 --- /dev/null +++ b/arch/arm64/mm/test_mtecomp.c @@ -0,0 +1,364 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Test cases for MTE tags compression algorithm. + */ + +#include +#include +#include +#include +#include + +#include + +#include + +#include "mtecomp.h" + +/* Per-test storage allocated in mtecomp_test_init(). */ +struct test_data { + u8 *tags, *dtags; + unsigned short *r_sizes; + size_t r_len; + u8 *r_tags; +}; + +/* + * Split td->tags to ranges stored in td->r_tags, td->r_sizes, td->r_len, + * then convert those ranges back to tags stored in td->dtags. 
+ */ +static void tags_to_ranges_to_tags_helper(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + mte_tags_to_ranges(td->tags, td->r_tags, td->r_sizes, &td->r_len); + mte_ranges_to_tags(td->r_tags, td->r_sizes, td->r_len, td->dtags); + KUNIT_EXPECT_EQ(test, memcmp(td->tags, td->dtags, MTE_PAGE_TAG_STORAGE), + 0); +} + +/* + * Test that mte_tags_to_ranges() produces a single range for a zero-fille= d tag + * buffer. + */ +static void test_tags_to_ranges_zero(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + tags_to_ranges_to_tags_helper(test); + + KUNIT_EXPECT_EQ(test, td->r_len, 1); + KUNIT_EXPECT_EQ(test, td->r_tags[0], 0); + KUNIT_EXPECT_EQ(test, td->r_sizes[0], MTE_GRANULES_PER_PAGE); +} + +/* + * Test that a small number of different tags is correctly transformed into + * ranges. + */ +static void test_tags_to_ranges_simple(struct kunit *test) +{ + struct test_data *td =3D test->priv; + const u8 ex_tags[] =3D { 0xa, 0x0, 0xa, 0xb, 0x0 }; + const unsigned short ex_sizes[] =3D { 1, 2, 2, 1, + MTE_GRANULES_PER_PAGE - 6 }; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + td->tags[0] =3D 0xa0; + td->tags[1] =3D 0x0a; + td->tags[2] =3D 0xab; + tags_to_ranges_to_tags_helper(test); + + KUNIT_EXPECT_EQ(test, td->r_len, 5); + KUNIT_EXPECT_EQ(test, memcmp(td->r_tags, ex_tags, sizeof(ex_tags)), 0); + KUNIT_EXPECT_EQ(test, memcmp(td->r_sizes, ex_sizes, sizeof(ex_sizes)), + 0); +} + +/* Test that repeated 0xa0 byte produces MTE_GRANULES_PER_PAGE ranges of l= ength 1. */ +static void test_tags_to_ranges_repeated(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + memset(td->tags, 0xa0, MTE_PAGE_TAG_STORAGE); + tags_to_ranges_to_tags_helper(test); + + KUNIT_EXPECT_EQ(test, td->r_len, MTE_GRANULES_PER_PAGE); +} + +/* Generate a buffer that will contain @nranges of tag ranges. 
*/ +static void gen_tag_range_helper(u8 *tags, int nranges) +{ + unsigned int i; + + memset(tags, 0, MTE_PAGE_TAG_STORAGE); + if (nranges > 1) { + nranges--; + for (i =3D 0; i < nranges / 2; i++) + tags[i] =3D 0xab; + if (nranges % 2) + tags[nranges / 2] =3D 0xa0; + } +} + +/* + * Test that mte_tags_to_ranges()/mte_ranges_to_tags() work for various + * r_len values. + */ +static void test_tag_to_ranges_n(struct kunit *test) +{ + struct test_data *td =3D test->priv; + unsigned int i, j, sum; + + for (i =3D 1; i <=3D MTE_GRANULES_PER_PAGE; i++) { + gen_tag_range_helper(td->tags, i); + tags_to_ranges_to_tags_helper(test); + sum =3D 0; + for (j =3D 0; j < td->r_len; j++) + sum +=3D td->r_sizes[j]; + KUNIT_EXPECT_EQ(test, sum, MTE_GRANULES_PER_PAGE); + } +} + +/* + * Check that the tag buffer in test->priv can be compressed and decompres= sed + * without changes. + */ +static void *compress_decompress_helper(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + handle =3D mte_compress(td->tags); + KUNIT_EXPECT_EQ(test, (unsigned long)handle & BIT_ULL(63), 0); + if (handle) { + KUNIT_EXPECT_TRUE(test, mte_decompress(handle, td->dtags)); + KUNIT_EXPECT_EQ(test, memcmp(td->tags, td->dtags, MTE_PAGE_TAG_STORAGE), + 0); + } + return handle; +} + +/* Test that a zero-filled array is compressed into inline storage. */ +static void test_compress_zero(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + handle =3D compress_decompress_helper(test); + /* Tags are stored inline. */ + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +/* Test that a 0xaa-filled array is compressed into inline storage. */ +static void test_compress_nonzero(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + memset(td->tags, 0xaa, MTE_PAGE_TAG_STORAGE); + handle =3D compress_decompress_helper(test); + /* Tags are stored inline. 
*/ + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +/* + * Test that two tag ranges are compressed into inline storage. + * + * This also covers a special case where both ranges contain + * `MTE_GRANULES_PER_PAGE / 2` tags and overflow the designated range size. + */ +static void test_two_ranges(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + unsigned int i; + size_t r_len =3D 2; + unsigned char r_tags[2] =3D { 0xe, 0x0 }; + unsigned short r_sizes[2]; + + for (i =3D 1; i < MTE_GRANULES_PER_PAGE; i++) { + r_sizes[0] =3D i; + r_sizes[1] =3D MTE_GRANULES_PER_PAGE - i; + mte_ranges_to_tags(r_tags, r_sizes, r_len, td->tags); + handle =3D compress_decompress_helper(test); + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); + } +} + +/* + * Test that a very small number of tag ranges ends up compressed into 8 b= ytes. + */ +static void test_compress_simple(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + td->tags[0] =3D 0xa0; + td->tags[1] =3D 0x0a; + + handle =3D compress_decompress_helper(test); + /* Tags are stored inline. */ + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +/* + * Test that a buffer containing @nranges ranges compresses into @exp_size + * bytes and decompresses into the original tag sequence. + */ +static void compress_range_helper(struct kunit *test, int nranges, + bool exp_inl) +{ + struct test_data *td =3D test->priv; + void *handle; + + gen_tag_range_helper(td->tags, nranges); + handle =3D compress_decompress_helper(test); + KUNIT_EXPECT_EQ(test, mte_is_compressed(handle), exp_inl); +} + +static inline size_t max_inline_ranges(void) +{ +#if defined CONFIG_ARM64_4K_PAGES + return 6; +#elif defined(CONFIG_ARM64_16K_PAGES) + return 5; +#else + return 4; +#endif +} + +/* + * Test that every number of tag ranges is correctly compressed and + * decompressed. 
+ */ +static void test_compress_ranges(struct kunit *test) +{ + unsigned int i; + bool exp_inl; + + for (i =3D 1; i <=3D MTE_GRANULES_PER_PAGE; i++) { + exp_inl =3D i <=3D max_inline_ranges(); + compress_range_helper(test, i, exp_inl); + } +} + +/* + * Test that invalid handles are ignored by mte_decompress(). + */ +static void test_decompress_invalid(struct kunit *test) +{ + void *handle1 =3D (void *)0xeb0b0b0100804020; + void *handle2 =3D (void *)0x6b0b0b010080402f; + struct test_data *td =3D test->priv; + + /* handle1 has bit 0 set to 1. */ + KUNIT_EXPECT_FALSE(test, mte_decompress(handle1, td->dtags)); + /* + * handle2 is an inline handle, but its largest_idx (bits 1..3) + * is out of bounds for the inline storage. + */ + KUNIT_EXPECT_FALSE(test, mte_decompress(handle2, td->dtags)); +} + +/* + * Test that compressed inline tags cannot be confused with out-of-line + * pointers. + * + * Compressed values are written from bit 0 to bit 63, so the size of the = last + * tag range initially ends up in the upper bits of the inline representat= ion. + * Make sure mte_compress() rearranges the bits so that the resulting hand= le does + * not have 0b0111 as the upper four bits. + */ +static void test_upper_bits(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + unsigned char r_tags[6] =3D { 7, 0, 7, 0, 7, 0 }; + unsigned short r_sizes[6] =3D { 1, 1, 1, 1, 1, 1 }; + size_t r_len; + + /* Maximum number of ranges that can be encoded inline. */ + r_len =3D max_inline_ranges(); + /* Maximum range size possible, will be omitted. */ + r_sizes[0] =3D MTE_GRANULES_PER_PAGE / 2 - 1; + /* A number close to r_sizes[0] that has most of its bits set. 
*/ + r_sizes[r_len - 1] =3D MTE_GRANULES_PER_PAGE - r_sizes[0] - r_len + 2; + + mte_ranges_to_tags(r_tags, r_sizes, r_len, td->tags); + handle =3D compress_decompress_helper(test); + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +static void mtecomp_dealloc_testdata(struct test_data *td) +{ + kfree(td->tags); + kfree(td->dtags); + kfree(td->r_sizes); + kfree(td->r_tags); +} + +static int mtecomp_test_init(struct kunit *test) +{ + struct test_data *td; + + td =3D kzalloc(sizeof(struct test_data), GFP_KERNEL); + if (!td) + return -ENOMEM; + td->tags =3D kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL); + if (!td->tags) + goto error; + td->dtags =3D kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL); + if (!td->dtags) + goto error; + td->r_len =3D MTE_GRANULES_PER_PAGE; + td->r_sizes =3D kmalloc_array(MTE_GRANULES_PER_PAGE, + sizeof(unsigned short), GFP_KERNEL); + if (!td->r_sizes) + goto error; + td->r_tags =3D kmalloc(MTE_GRANULES_PER_PAGE, GFP_KERNEL); + if (!td->r_tags) + goto error; + test->priv =3D (void *)td; + return 0; +error: + mtecomp_dealloc_testdata(td); + return -ENOMEM; +} + +static void mtecomp_test_exit(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + mtecomp_dealloc_testdata(td); +} + +static struct kunit_case mtecomp_test_cases[] =3D { + KUNIT_CASE(test_tags_to_ranges_zero), + KUNIT_CASE(test_tags_to_ranges_simple), + KUNIT_CASE(test_tags_to_ranges_repeated), + KUNIT_CASE(test_tag_to_ranges_n), + KUNIT_CASE(test_compress_zero), + KUNIT_CASE(test_compress_nonzero), + KUNIT_CASE(test_two_ranges), + KUNIT_CASE(test_compress_simple), + KUNIT_CASE(test_compress_ranges), + KUNIT_CASE(test_decompress_invalid), + KUNIT_CASE(test_upper_bits), + {} +}; + +static struct kunit_suite mtecomp_test_suite =3D { + .name =3D "mtecomp", + .init =3D mtecomp_test_init, + .exit =3D mtecomp_test_exit, + .test_cases =3D mtecomp_test_cases, +}; +kunit_test_suites(&mtecomp_test_suite); + +MODULE_IMPORT_NS(MTECOMP); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Alexander Potapenko
"); --=20 2.43.0.472.g3155946c3a-goog From nobody Sat Dec 27 17:02:36 2025 Received: from mail-ed1-f73.google.com (mail-ed1-f73.google.com [209.85.208.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A943280DF5 for ; Mon, 18 Dec 2023 12:40:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--glider.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="XQ2PAozi" Received: by mail-ed1-f73.google.com with SMTP id 4fb4d7f45d1cf-55330b01be0so644568a12.2 for ; Mon, 18 Dec 2023 04:40:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1702903255; x=1703508055; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=qETeNVW4T3lC++QM3orD9LTmmtp3iyUET5aUKhJnmac=; b=XQ2PAozi99uKDNjgZmOtT7chjQ1fFblS0lXkNUkPH6jZsHeGpfVXICRcSTX5EGfpkI svBipq0FRP+uQdXQCF32tREO5KRQDMmc4EchX8oIz/i32Mesz8zmLAn4bCDbfnu5uy0A vodMXL1m+4HF4mBPVJXwxo74KV4qwh9FA9jqjKbHdF51oPt1NUj9UiApPUO3r3mhdxzI c4qvodLOPfdUDji8rGmMpYAaBNDcjFx2fjoGJcdf1QGsHCsLNxnG9MpgB3oLc7rvY3rC vumaRyKp38iTfwpPpqb2Fp7KpDLaQpoS1l6JfuH4czRXvmvn6ly4WuIBV5Ysixbx8lOY 5hKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702903255; x=1703508055; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qETeNVW4T3lC++QM3orD9LTmmtp3iyUET5aUKhJnmac=; b=R3tYEGv1/y/obAUSHoJRfwwOXqHAZ/y+moldMy/aSSKWr6zBUiv1j24e00h5rzHza0 V3DzvULljaIIZ+G0O+3JCYmEL9YC5tIdktCQOPgNBAj7lr+3I9plMPv/dMwfAjWsVSVQ 
l31YJm0WyrVZ4L0HQE3Hw5NDEea6thnAf95nCeYthuJf0MJUDso/moo15/8kqMvnhAeQ bfhkTpXnb4MXbKXRwlWgml23cVyYeRVwDo1P0AsdwyIcgtBCDb/x7ulvrbVx7KIHpPbN 8gXvWSmNnoo+IBlKsZNuc5PxbW4rwGYQxRQGwTIug+shj5Y8ourtJOtM/b8RmyXmy08p dZyA== X-Gm-Message-State: AOJu0Ywsgw2pVryHq5ej37mYN/JIKq3yTE/+AgAiWpMtYDtw9hR3wTTG o+raXVDkJON0D3YpPkl6PwrEhbl8WD0= X-Google-Smtp-Source: AGHT+IEY9hmIS94WMO/1gQI8OpqW8L6q4Ui68LjYFEltyuc+bpHhQQF2yLnqqce6Ij7CvGAUC6mng50r1nM= X-Received: from glider.muc.corp.google.com ([2a00:79e0:9c:201:9a3f:c8e7:1395:806e]) (user=glider job=sendgmr) by 2002:a50:ed8c:0:b0:553:7dac:75cf with SMTP id h12-20020a50ed8c000000b005537dac75cfmr256edr.3.1702903254957; Mon, 18 Dec 2023 04:40:54 -0800 (PST) Date: Mon, 18 Dec 2023 13:40:32 +0100 In-Reply-To: <20231218124033.551770-1-glider@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231218124033.551770-1-glider@google.com> X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog Message-ID: <20231218124033.551770-7-glider@google.com> Subject: [PATCH v11-mte 6/7] arm64: mte: add compression support to mteswap.c From: Alexander Potapenko To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com, aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com, alexandru.elisei@arm.com Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Update mteswap.c to perform inline compression of memory tags when possible. If CONFIG_ARM64_MTE_COMP is enabled, mteswap.c will attempt to compress saved tags for a struct page and store them directly in Xarray entry instead of wasting heap space. 
Soon after booting Android, tag compression roughly halves the memory that mteswap.c previously spent on tag allocations. On a moderately loaded device with ~20% tagged pages, this leads to saving several megabytes of kernel heap: # cat /sys/kernel/debug/mteswap/stats 8 bytes: 102496 allocations, 67302 deallocations 128 bytes: 212234 allocations, 178278 deallocations uncompressed tag storage size: 8851200 compressed tag storage size: 4346368 (statistics collection is introduced in the following patch) Signed-off-by: Alexander Potapenko Reviewed-by: Catalin Marinas --- v10-mte: - added Catalin's Reviewed-by: v9: - as requested by Yury Norov, split off statistics collection into a separate patch - minor fixes v8: - adapt to the new compression API, abandon mteswap_{no,}comp.c - move stats collection to mteswap.c v5: - drop a dead variable from _mte_free_saved_tags() in mteswap_comp.c - ensure MTE compression works with arbitrary page sizes - update patch description v4: - minor code simplifications suggested by Andy Shevchenko, added missing header dependencies - changed compression API names to reflect modifications made to memcomp.h (as suggested by Yury Norov) v3: - Addressed comments by Andy Shevchenko in another patch: - fixed includes order - replaced u64 with unsigned long - added MODULE_IMPORT_NS(MTECOMP) --- arch/arm64/mm/mteswap.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index a31833e3ddc54..70f5c8ecd640d 100644 --- a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -6,6 +6,8 @@ #include #include #include +#include +#include "mtecomp.h" =20 static DEFINE_XARRAY(mte_pages); =20 @@ -17,12 +19,13 @@ void *mte_allocate_tag_storage(void) =20 void mte_free_tag_storage(char *storage) { - kfree(storage); + if (!mte_is_compressed(storage)) + kfree(storage); } =20 int mte_save_tags(struct page *page) { - void *tag_storage, *ret; + void *tag_storage, *compressed_storage, *ret;
=20 if (!page_mte_tagged(page)) return 0; @@ -32,6 +35,11 @@ int mte_save_tags(struct page *page) return -ENOMEM; =20 mte_save_page_tags(page_address(page), tag_storage); + compressed_storage =3D mte_compress(tag_storage); + if (compressed_storage) { + mte_free_tag_storage(tag_storage); + tag_storage =3D compressed_storage; + } =20 /* lookup the swap entry.val from the page */ ret =3D xa_store(&mte_pages, page_swap_entry(page).val, tag_storage, @@ -50,13 +58,20 @@ int mte_save_tags(struct page *page) void mte_restore_tags(swp_entry_t entry, struct page *page) { void *tags =3D xa_load(&mte_pages, entry.val); + void *tag_storage =3D NULL; =20 if (!tags) return; =20 if (try_page_mte_tagging(page)) { + if (mte_is_compressed(tags)) { + tag_storage =3D mte_allocate_tag_storage(); + mte_decompress(tags, tag_storage); + tags =3D tag_storage; + } mte_restore_page_tags(page_address(page), tags); set_page_mte_tagged(page); + mte_free_tag_storage(tag_storage); } } =20 --=20 2.43.0.472.g3155946c3a-goog From nobody Sat Dec 27 17:02:36 2025 Received: from mail-ed1-f74.google.com (mail-ed1-f74.google.com [209.85.208.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B0A0480E0B for ; Mon, 18 Dec 2023 12:40:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--glider.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="eHkQINzW" Received: by mail-ed1-f74.google.com with SMTP id 4fb4d7f45d1cf-5534b41f529so478276a12.0 for ; Mon, 18 Dec 2023 04:40:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1702903258; x=1703508058; darn=vger.kernel.org; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=tbTPsWqaE0zk/zTD6ZnQV+kyVQAmtpdkrOV6ohS4lXU=; b=eHkQINzWdORAI45+Bb7WutxyiritI/YLFdw6Z9ArWl85qNYtIQrw94Jf0bNb2qjG2R cGLBldv6KEqkSdac19G+ckFEeHuR7Z1/u1HLCi0gi+qrz7c5NrQJL9jw9n7+BfMoKLJW /rPlAybna3+FRZeoRDrCWWWbwyxnl5gEvD5FSk0+IEbm7/qX3h3woHQGKNV0tgkGohV2 0mVtoFmbkZ94Al4Jhe9DGZFyH0asEWxRg4PC8oWGe6qyA2vwtJRCLFI292EhzZilQ/my 1zgJT0Sf1GxBUgxrnasH9R6z84bbvSgzBqExAa0XABzSGlEYhbHoztFOKTyFNZ6G5tKZ cs7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702903258; x=1703508058; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tbTPsWqaE0zk/zTD6ZnQV+kyVQAmtpdkrOV6ohS4lXU=; b=tQ+ZZ6VcDtFkujbyV7T3XhNYK+fPycFGcz2Zpm02SrmTCJBRrppMbxSBisHdoRbmqG kw6A2mQkbQYYS73haAB8c+2Fle3zmDQujQuVOHWstIImcOU6KNOb/RTRRk0TmtPFkr/s NNW49DohZ+2HvEQ2pisNJHog1JvPaGSDrfeKZ2t4JQeqHoLdWqWbpSzvJSIjlTeXDPET ZV3kkm44tm95crmVq4HL1R5M8c5BY/lUnV5yIoAAwEvDu7OmyLfo1EMps1KCCuZfPvVR D6uz/uVy9eLXnmW8ouJKDS3JF+qDWVaJ6ditLqfRo+Vu9O/bO0lWapa/BuVXzsvaGAVj 64hw== X-Gm-Message-State: AOJu0Yx+BEHhczkre6B0JyZ3ZQhoP92PQ+cJhCdtwxZbWXMWHIdDHqx5 tNvXzfPSqMEgtd3x6BZqPFtQDWTfM7k= X-Google-Smtp-Source: AGHT+IFMV7VwSrjypK+QQcM8Gkbn6+2AcXQTl/9IjPplGqpiMfDbdEPDzKVag/hy05/X3j0+nq6wM7Yrbe4= X-Received: from glider.muc.corp.google.com ([2a00:79e0:9c:201:9a3f:c8e7:1395:806e]) (user=glider job=sendgmr) by 2002:a50:cc8e:0:b0:54d:15e9:560 with SMTP id q14-20020a50cc8e000000b0054d15e90560mr125342edi.2.1702903257881; Mon, 18 Dec 2023 04:40:57 -0800 (PST) Date: Mon, 18 Dec 2023 13:40:33 +0100 In-Reply-To: <20231218124033.551770-1-glider@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231218124033.551770-1-glider@google.com> X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog Message-ID: 
<20231218124033.551770-8-glider@google.com> Subject: [PATCH v11-mte 7/7] arm64: mte: implement CONFIG_ARM64_MTE_SWAP_STATS From: Alexander Potapenko To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com, aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com, alexandru.elisei@arm.com Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Provide a config to collect the usage statistics for ARM MTE tag compression. This patch introduces allocation/deallocation counters for buffers that were stored uncompressed (and thus occupy 128 bytes of heap plus the Xarray overhead to store a pointer) and those that were compressed into 8-byte pointers (effectively using 0 bytes of heap in addition to the Xarray overhead). The counters are exposed to the userspace via /sys/kernel/debug/mteswap/stats: # cat /sys/kernel/debug/mteswap/stats 8 bytes: 102496 allocations, 67302 deallocations 128 bytes: 212234 allocations, 178278 deallocations uncompressed tag storage size: 8851200 compressed tag storage size: 4346368 Suggested-by: Yury Norov Signed-off-by: Alexander Potapenko Acked-by: Catalin Marinas Reviewed-by: Yury Norov --- This patch was split off from the "arm64: mte: add compression support to mteswap.c" patch (https://lore.kernel.org/linux-arm-kernel/ZUVulBKVYK7cq2rJ@yury-ThinkPad/T/= #m819ec30beb9de53d5c442f7e3247456f8966d88a) v11-mte: - add Yury's Reviewed-by: v10-mte: - added Catalin's Acked-by: v9: - add this patch, put the stats behind a separate config, mention /sys/kernel/debug/mteswap/stats in the documentation --- .../arch/arm64/mte-tag-compression.rst | 12 +++ arch/arm64/Kconfig | 15 +++ arch/arm64/mm/mteswap.c | 93 ++++++++++++++++++- 3 files changed, 118 insertions(+), 2 deletions(-) diff --git 
a/Documentation/arch/arm64/mte-tag-compression.rst b/Documentati= on/arch/arm64/mte-tag-compression.rst index 8fe6b51a9db6d..4c25b96f7d4b5 100644 --- a/Documentation/arch/arm64/mte-tag-compression.rst +++ b/Documentation/arch/arm64/mte-tag-compression.rst @@ -145,6 +145,18 @@ Tag compression and decompression implicitly rely on t= he fixed MTE tag size (4 bits) and number of tags per page. Should these values change, the algo= rithm may need to be revised. =20 +Stats +=3D=3D=3D=3D=3D + +When `CONFIG_ARM64_MTE_SWAP_STATS` is enabled, `arch/arm64/mm/mteswap.c` e= xports +usage statistics for tag compression used when swapping tagged pages. The = data +can be accessed via debugfs:: + + # cat /sys/kernel/debug/mteswap/stats + 8 bytes: 10438 allocations, 10417 deallocations + 128 bytes: 26180 allocations, 26179 deallocations + uncompressed tag storage size: 2816 + compressed tag storage size: 128 =20 Programming Interface =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 6a1397a96f2f0..49a786c7edadd 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2100,6 +2100,21 @@ config ARM64_MTE_COMP_KUNIT_TEST be compressed into pointer-size values and correctly decompressed afterwards. =20 +config ARM64_MTE_SWAP_STATS + bool "Collect usage statistics of tag compression for swapped MTE tags" + default y + depends on ARM64_MTE && ARM64_MTE_COMP + help + Collect usage statistics for ARM64 MTE tag compression during swapping. + + Adds allocation/deallocation counters for buffers that were stored + uncompressed (and thus occupy 128 bytes of heap plus the Xarray + overhead to store a pointer) and those that were compressed into + 8-byte pointers (effectively using 0 bytes of heap in addition to + the Xarray overhead). + The counters are exposed to the userspace via + /sys/kernel/debug/mteswap/stats. 
+ config ARM64_SVE bool "ARM Scalable Vector Extension support" default y diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index 70f5c8ecd640d..1c6c78b9a9037 100644 --- a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only =20 +#include #include #include #include @@ -11,16 +12,54 @@ =20 static DEFINE_XARRAY(mte_pages); =20 +enum mteswap_counters { + MTESWAP_CTR_INLINE =3D 0, + MTESWAP_CTR_OUTLINE, + MTESWAP_CTR_SIZE +}; + +#if defined(CONFIG_ARM64_MTE_SWAP_STATS) +static atomic_long_t alloc_counters[MTESWAP_CTR_SIZE]; +static atomic_long_t dealloc_counters[MTESWAP_CTR_SIZE]; + +static void inc_alloc_counter(int kind) +{ + atomic_long_inc(&alloc_counters[kind]); +} + +static void inc_dealloc_counter(int kind) +{ + atomic_long_inc(&dealloc_counters[kind]); +} +#else +static void inc_alloc_counter(int kind) +{ +} + +static void inc_dealloc_counter(int kind) +{ +} +#endif + void *mte_allocate_tag_storage(void) { + void *ret; + /* tags granule is 16 bytes, 2 tags stored per byte */ - return kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL); + ret =3D kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL); + if (ret) + inc_alloc_counter(MTESWAP_CTR_OUTLINE); + return ret; } =20 void mte_free_tag_storage(char *storage) { - if (!mte_is_compressed(storage)) + if (!mte_is_compressed(storage)) { kfree(storage); + inc_dealloc_counter(MTESWAP_CTR_OUTLINE); + } else { + inc_dealloc_counter(MTESWAP_CTR_INLINE); + } } =20 int mte_save_tags(struct page *page) @@ -39,6 +78,7 @@ int mte_save_tags(struct page *page) if (compressed_storage) { mte_free_tag_storage(tag_storage); tag_storage =3D compressed_storage; + inc_alloc_counter(MTESWAP_CTR_INLINE); } =20 /* lookup the swap entry.val from the page */ @@ -98,3 +138,52 @@ void mte_invalidate_tags_area(int type) } xa_unlock(&mte_pages); } + +#if defined(CONFIG_ARM64_MTE_SWAP_STATS) +/* DebugFS interface. 
*/ +static int stats_show(struct seq_file *seq, void *v) +{ + unsigned long total_mem_alloc =3D 0, total_mem_dealloc =3D 0; + unsigned long total_num_alloc =3D 0, total_num_dealloc =3D 0; + unsigned long sizes[2] =3D { 8, MTE_PAGE_TAG_STORAGE }; + long alloc, dealloc; + unsigned long size; + int i; + + for (i =3D 0; i < MTESWAP_CTR_SIZE; i++) { + alloc =3D atomic_long_read(&alloc_counters[i]); + dealloc =3D atomic_long_read(&dealloc_counters[i]); + total_num_alloc +=3D alloc; + total_num_dealloc +=3D dealloc; + size =3D sizes[i]; + /* + * Do not count 8-byte buffers towards compressed tag storage + * size. + */ + if (i) { + total_mem_alloc +=3D (size * alloc); + total_mem_dealloc +=3D (size * dealloc); + } + seq_printf(seq, + "%lu bytes:\t%lu allocations,\t%lu deallocations\n", + size, alloc, dealloc); + } + seq_printf(seq, "uncompressed tag storage size:\t%lu\n", + (total_num_alloc - total_num_dealloc) * + MTE_PAGE_TAG_STORAGE); + seq_printf(seq, "compressed tag storage size:\t%lu\n", + total_mem_alloc - total_mem_dealloc); + return 0; +} +DEFINE_SHOW_ATTRIBUTE(stats); + +static int mteswap_init(void) +{ + struct dentry *mteswap_dir; + + mteswap_dir =3D debugfs_create_dir("mteswap", NULL); + debugfs_create_file("stats", 0444, mteswap_dir, NULL, &stats_fops); + return 0; +} +module_init(mteswap_init); +#endif --=20 2.43.0.472.g3155946c3a-goog
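The two summary lines printed by stats_show() follow from the four counters by simple arithmetic. The user-space sketch below reproduces that calculation so the numbers quoted in the commit messages can be checked by hand. It is an illustration only, not code from this series: struct mteswap_stats, uncompressed_size() and compressed_size() are hypothetical names, and TAG_STORAGE assumes MTE_PAGE_TAG_STORAGE is 128 bytes, as the "128 bytes" row of the output suggests.

```c
/* User-space illustration only; not kernel code from this series. */

/* Assumption: MTE_PAGE_TAG_STORAGE is 128 bytes, per the "128 bytes" row. */
#define TAG_STORAGE 128UL

/* Hypothetical snapshot of the four counters shown in the stats file. */
struct mteswap_stats {
	unsigned long inline_alloc, inline_dealloc;   /* "8 bytes" row */
	unsigned long outline_alloc, outline_dealloc; /* "128 bytes" row */
};

/*
 * Heap that would be in use if every live tag buffer were stored
 * uncompressed: all live buffers, inline or not, count as 128 bytes.
 */
static unsigned long uncompressed_size(const struct mteswap_stats *s)
{
	unsigned long live = (s->inline_alloc + s->outline_alloc) -
			     (s->inline_dealloc + s->outline_dealloc);

	return live * TAG_STORAGE;
}

/*
 * Heap actually in use with compression: only out-of-line buffers
 * count, because inline handles live in the XArray entry itself.
 */
static unsigned long compressed_size(const struct mteswap_stats *s)
{
	return (s->outline_alloc - s->outline_dealloc) * TAG_STORAGE;
}
```

With the counters from the commit message (102496/67302 for the 8-byte row, 212234/178278 for the 128-byte row) this yields 8851200 and 4346368 bytes, matching the quoted output.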