From nobody Sun Dec 28 00:48:26 2025
Date: Thu, 14 Dec 2023 12:06:33 +0100
In-Reply-To: <20231214110639.2294687-1-glider@google.com>
References: <20231214110639.2294687-1-glider@google.com>
Message-ID: <20231214110639.2294687-2-glider@google.com>
Subject: [PATCH v10-mte 1/7] lib/bitmap: add bitmap_{read,write}()
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org,
 Arnd Bergmann

From: Syed Nayyar Waris

The two new functions allow reading/writing values of length up to
BITS_PER_LONG bits at arbitrary position in the bitmap.
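
As an illustration only (not part of the patch), the semantics described
above can be sketched in user-space C. The sk_-prefixed names and
SK_BITS_PER_LONG are hypothetical stand-ins for the kernel helpers, and the
sketch assumes the usual kernel bitmap layout in which bit 0 is the least
significant bit of map[0]:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SK_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Mask with the low n bits set; valid only for 1 <= n <= SK_BITS_PER_LONG. */
static unsigned long sk_low_mask(unsigned long n)
{
	return ~0UL >> (SK_BITS_PER_LONG - n);
}

/* Read nbits starting at bit 'start'; returns 0 for an unsupported nbits. */
static unsigned long sk_bitmap_read(const unsigned long *map,
				    unsigned long start, unsigned long nbits)
{
	size_t index = start / SK_BITS_PER_LONG;
	unsigned long offset = start % SK_BITS_PER_LONG;
	unsigned long space = SK_BITS_PER_LONG - offset;

	if (!nbits || nbits > SK_BITS_PER_LONG)
		return 0;

	/* The value fits into a single word. */
	if (space >= nbits)
		return (map[index] >> offset) & sk_low_mask(nbits);

	/* The value straddles two words: low part first, then high part. */
	return (map[index] >> offset) |
	       ((map[index + 1] & sk_low_mask(nbits - space)) << space);
}

/* Write the low nbits of value at bit 'start'; no-op for unsupported nbits. */
static void sk_bitmap_write(unsigned long *map, unsigned long value,
			    unsigned long start, unsigned long nbits)
{
	size_t index = start / SK_BITS_PER_LONG;
	unsigned long offset = start % SK_BITS_PER_LONG;
	unsigned long space = SK_BITS_PER_LONG - offset;

	if (!nbits || nbits > SK_BITS_PER_LONG)
		return;

	value &= sk_low_mask(nbits);
	if (space >= nbits) {
		map[index] &= ~(sk_low_mask(nbits) << offset);
		map[index] |= value << offset;
		return;
	}

	/* Keep only the bits below 'offset' in the first word. */
	map[index] &= ~(~0UL << offset);
	map[index] |= value << offset;
	/* Clear the low (nbits - space) bits of the next word, then fill them. */
	map[index + 1] &= ~sk_low_mask(nbits - space);
	map[index + 1] |= value >> space;
}

/* Round-trip check: a 5-bit and a 3-bit value merge into one 8-bit value. */
static void sk_selfcheck(void)
{
	unsigned long map[4];
	unsigned long i, total = 4 * SK_BITS_PER_LONG;

	for (i = 0; i + 8 <= total; i++) {
		map[0] = map[1] = map[2] = map[3] = 0;
		sk_bitmap_write(map, 0x15, i, 5);	/* binary 10101 */
		sk_bitmap_write(map, 0x5, i + 5, 3);	/* binary 101 */
		assert(sk_bitmap_read(map, i, 5) == 0x15);
		assert(sk_bitmap_read(map, i + 5, 3) == 0x5);
		assert(sk_bitmap_read(map, i, 8) == 0xb5);
	}
}
```

Writing 8 bits starting 4 bits before a word boundary, for instance, places
the low nibble at the top of the first word and the high nibble at the bottom
of the next one; sk_selfcheck() walks such boundary cases for every start
position.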

The code was taken from "bitops: Introduce the for_each_set_clump macro"
by Syed Nayyar Waris with a number of changes and simplifications:
 - instead of using roundup(), which adds an unnecessary dependency
   on , we calculate space as BITS_PER_LONG-offset;
 - indentation is reduced by not using else-clauses (suggested by
   checkpatch for bitmap_get_value());
 - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
   and bitmap_write();
 - some redundant computations are omitted.

Cc: Arnd Bergmann
Signed-off-by: Syed Nayyar Waris
Signed-off-by: William Breathitt Gray
Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
Suggested-by: Yury Norov
Co-developed-by: Alexander Potapenko
Signed-off-by: Alexander Potapenko
Reviewed-by: Andy Shevchenko
Acked-by: Yury Norov
---
v10-mte:
 - send this patch together with the "Implement MTE tag compression for
   swapped pages"

Revisions v8-v12 of bitmap patches were reviewed separately from the
"Implement MTE tag compression for swapped pages" series
(https://lore.kernel.org/lkml/20231109151106.2385155-1-glider@google.com/)

This patch was previously called "lib/bitmap: add
bitmap_{set,get}_value()"
(https://lore.kernel.org/lkml/20230720173956.3674987-2-glider@google.com/)

v11:
 - rearrange whitespace as requested by Andy Shevchenko,
   add Reviewed-by:, update a comment

v10:
 - update comments as requested by Andy Shevchenko

v8:
 - as suggested by Andy Shevchenko, handle reads/writes of more than
   BITS_PER_LONG bits, add a note for 32-bit systems

v7:
 - Address comments by Yury Norov, Andy Shevchenko, Rasmus Villemoes:
   - update code comments;
   - get rid of GENMASK();
   - s/assign_bit/__assign_bit;
   - more vertical whitespace for better readability;
   - more compact code for bitmap_write() (now for real)

v6:
 - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
   return 0.

v5:
 - Address comments by Yury Norov:
   - updated code comments and patch title/description
   - replace GENMASK(nbits - 1, 0) with BITMAP_LAST_WORD_MASK(nbits)
   - more compact bitmap_write() implementation

v4:
 - Address comments by Andy Shevchenko and Yury Norov:
   - prevent passing values >= 64 to GENMASK()
   - fix commit authorship
   - change comments
   - check for unlikely(nbits==0)
   - drop unnecessary const declarations
   - fix kernel-doc comments
   - rename bitmap_{get,set}_value() to bitmap_{read,write}()
---
 include/linux/bitmap.h | 77 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99451431e4d65..7ca0379be8c13 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -79,6 +79,10 @@ struct device;
  *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
  *  bitmap_get_value8(map, start)               Get 8bit value from map at start
  *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
+ *  bitmap_read(map, start, nbits)              Read an nbits-sized value from
+ *                                              map at start
+ *  bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
+ *                                              map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -636,6 +640,79 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 	map[index] |= value << offset;
 }
 
+/**
+ * bitmap_read - read a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
+ *
+ * Returns: value of @nbits bits located at the @start bit offset within the
+ * @map memory region. For @nbits = 0 and @nbits > BITS_PER_LONG the return
+ * value is undefined.
+ */
+static inline unsigned long bitmap_read(const unsigned long *map,
+					unsigned long start,
+					unsigned long nbits)
+{
+	size_t index = BIT_WORD(start);
+	unsigned long offset = start % BITS_PER_LONG;
+	unsigned long space = BITS_PER_LONG - offset;
+	unsigned long value_low, value_high;
+
+	if (unlikely(!nbits || nbits > BITS_PER_LONG))
+		return 0;
+
+	if (space >= nbits)
+		return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
+
+	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+	return (value_low >> offset) | (value_high << space);
+}
+
+/**
+ * bitmap_write - write n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value to write, clamped to nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
+ *
+ * bitmap_write() behaves as-if implemented as @nbits calls of __assign_bit(),
+ * i.e. bits beyond @nbits are ignored:
+ *
+ *   for (bit = 0; bit < nbits; bit++)
+ *           __assign_bit(start + bit, bitmap, val & BIT(bit));
+ *
+ * For @nbits == 0 and @nbits > BITS_PER_LONG no writes are performed.
+ */
+static inline void bitmap_write(unsigned long *map, unsigned long value,
+				unsigned long start, unsigned long nbits)
+{
+	size_t index;
+	unsigned long offset;
+	unsigned long space;
+	unsigned long mask;
+	bool fit;
+
+	if (unlikely(!nbits || nbits > BITS_PER_LONG))
+		return;
+
+	mask = BITMAP_LAST_WORD_MASK(nbits);
+	value &= mask;
+	offset = start % BITS_PER_LONG;
+	space = BITS_PER_LONG - offset;
+	fit = space >= nbits;
+	index = BIT_WORD(start);
+
+	map[index] &= (fit ? (~(mask << offset)) : ~BITMAP_FIRST_WORD_MASK(start));
+	map[index] |= value << offset;
+	if (fit)
+		return;
+
+	map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
+	map[index + 1] |= (value >> space);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sun Dec 28 00:48:26 2025
Date: Thu, 14 Dec 2023 12:06:34 +0100
In-Reply-To: <20231214110639.2294687-1-glider@google.com>
References: <20231214110639.2294687-1-glider@google.com>
Message-ID: <20231214110639.2294687-3-glider@google.com>
Subject: [PATCH v10-mte 2/7] lib/test_bitmap: add tests for bitmap_{read,write}()
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

Add basic tests ensuring that values can be added at arbitrary positions
of the bitmap, including those spanning into the adjacent unsigned
longs.

Two new performance tests, test_bitmap_read_perf() and
test_bitmap_write_perf(), can be used to assess future performance
improvements of bitmap_read() and bitmap_write():

[ 0.431119][ T1] test_bitmap: Time spent in test_bitmap_read_perf: 615253
[ 0.433197][ T1] test_bitmap: Time spent in test_bitmap_write_perf: 916313

(numbers from an Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz machine running
QEMU).

Signed-off-by: Alexander Potapenko
Reviewed-by: Andy Shevchenko
Acked-by: Yury Norov
---
v10-mte:
 - send this patch together with the "Implement MTE tag compression for
   swapped pages"

Revisions v8-v12 of bitmap patches were reviewed separately from the
"Implement MTE tag compression for swapped pages" series
(https://lore.kernel.org/lkml/20231109151106.2385155-1-glider@google.com/)

This patch was previously called
"lib/test_bitmap: add tests for bitmap_{set,get}_value()"
(https://lore.kernel.org/lkml/20230720173956.3674987-3-glider@google.com/)
and
"lib/test_bitmap: add tests for bitmap_{set,get}_value_unaligned"
(https://lore.kernel.org/lkml/20230713125706.2884502-3-glider@google.com/)

v12:
 - as suggested by Alexander Lobakin, replace expect_eq_uint() with
   expect_eq_ulong() and a cast

v9:
 - use WRITE_ONCE() to prevent optimizations in test_bitmap_read_perf()
 - update patch description

v8:
 - as requested by Andy Shevchenko, add tests for reading/writing
   sizes > BITS_PER_LONG

v7:
 - as requested by Yury Norov, add performance tests for bitmap_read()
   and bitmap_write()

v6:
 - use bitmap API to initialize test bitmaps
 - as requested by Yury Norov, do not check the return value of
   bitmap_read(..., 0)
 - fix a compiler warning on 32-bit systems

v5:
 - update patch title
 - address Yury Norov's comments:
   - rename the test cases
   - factor out test_bitmap_write_helper() to test
     writing over different background patterns;
   - add a test case copying a nontrivial value bit-by-bit;
   - drop volatile

v4:
 - Address comments by Andy Shevchenko: added Reviewed-by: and a link to
   the previous discussion
 - Address comments by Yury Norov:
   - expand the bitmap to catch more corner cases
   - add code testing that bitmap_set_value() does not touch adjacent
     bits
   - add code testing the nbits==0 case
   - rename bitmap_{get,set}_value() to bitmap_{read,write}()

v3:
 - switch to using bitmap_{set,get}_value()
 - change the expected bit pattern in test_set_get_value(),
   as the test was incorrectly assuming 0 is the LSB.
---
 lib/test_bitmap.c | 179 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 172 insertions(+), 7 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 65f22c2578b06..46c0154680772 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -60,18 +60,17 @@ static const unsigned long exp3_1_0[] __initconst = {
 };
 
 static bool __init
-__check_eq_uint(const char *srcfile, unsigned int line,
-		const unsigned int exp_uint, unsigned int x)
+__check_eq_ulong(const char *srcfile, unsigned int line,
+		 const unsigned long exp_ulong, unsigned long x)
 {
-	if (exp_uint != x) {
-		pr_err("[%s:%u] expected %u, got %u\n",
-			srcfile, line, exp_uint, x);
+	if (exp_ulong != x) {
+		pr_err("[%s:%u] expected %lu, got %lu\n",
+			srcfile, line, exp_ulong, x);
 		return false;
 	}
 	return true;
 }
 
-
 static bool __init
 __check_eq_bitmap(const char *srcfile, unsigned int line,
 		  const unsigned long *exp_bmap, const unsigned long *bmap,
@@ -185,7 +184,8 @@ __check_eq_str(const char *srcfile, unsigned int line,
 	result;								\
 })
 
-#define expect_eq_uint(...)		__expect_eq(uint, ##__VA_ARGS__)
+#define expect_eq_ulong(...)		__expect_eq(ulong, ##__VA_ARGS__)
+#define expect_eq_uint(x, y)		expect_eq_ulong((unsigned int)(x), (unsigned int)(y))
 #define expect_eq_bitmap(...)		__expect_eq(bitmap, ##__VA_ARGS__)
 #define expect_eq_pbl(...)		__expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)	__expect_eq(u32_array, ##__VA_ARGS__)
@@ -1245,6 +1245,168 @@ static void __init test_bitmap_const_eval(void)
 	BUILD_BUG_ON(~var != ~BIT(25));
 }
 
+/*
+ * Test bitmap should be big enough to include the cases when start is not in
+ * the first word, and start+nbits lands in the following word.
+ */
+#define TEST_BIT_LEN (1000)
+
+/*
+ * Helper function to test bitmap_write() overwriting the chosen byte pattern.
+ */
+static void __init test_bitmap_write_helper(const char *pattern)
+{
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	DECLARE_BITMAP(exp_bitmap, TEST_BIT_LEN);
+	DECLARE_BITMAP(pat_bitmap, TEST_BIT_LEN);
+	unsigned long w, r, bit;
+	int i, n, nbits;
+
+	/*
+	 * Only parse the pattern once and store the result in the intermediate
+	 * bitmap.
+	 */
+	bitmap_parselist(pattern, pat_bitmap, TEST_BIT_LEN);
+
+	/*
+	 * Check that writing a single bit does not accidentally touch the
+	 * adjacent bits.
+	 */
+	for (i = 0; i < TEST_BIT_LEN; i++) {
+		bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+		bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+		for (bit = 0; bit <= 1; bit++) {
+			bitmap_write(bitmap, bit, i, 1);
+			__assign_bit(i, exp_bitmap, bit);
+			expect_eq_bitmap(exp_bitmap, bitmap,
+					 TEST_BIT_LEN);
+		}
+	}
+
+	/* Ensure writing 0 bits does not change anything. */
+	bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+	bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+	for (i = 0; i < TEST_BIT_LEN; i++) {
+		bitmap_write(bitmap, ~0UL, i, 0);
+		expect_eq_bitmap(exp_bitmap, bitmap, TEST_BIT_LEN);
+	}
+
+	for (nbits = BITS_PER_LONG; nbits >= 1; nbits--) {
+		w = IS_ENABLED(CONFIG_64BIT) ? 0xdeadbeefdeadbeefUL
+					     : 0xdeadbeefUL;
+		w >>= (BITS_PER_LONG - nbits);
+		for (i = 0; i <= TEST_BIT_LEN - nbits; i++) {
+			bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+			bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+			for (n = 0; n < nbits; n++)
+				__assign_bit(i + n, exp_bitmap, w & BIT(n));
+			bitmap_write(bitmap, w, i, nbits);
+			expect_eq_bitmap(exp_bitmap, bitmap, TEST_BIT_LEN);
+			r = bitmap_read(bitmap, i, nbits);
+			expect_eq_ulong(r, w);
+		}
+	}
+}
+
+static void __init test_bitmap_read_write(void)
+{
+	unsigned char *pattern[3] = {"", "all:1/2", "all"};
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	unsigned long zero_bits = 0, bits_per_long = BITS_PER_LONG;
+	unsigned long val;
+	int i, pi;
+
+	/*
+	 * Reading/writing zero bits should not crash the kernel.
+	 * READ_ONCE() prevents constant folding.
+	 */
+	bitmap_write(NULL, 0, 0, READ_ONCE(zero_bits));
+	/* Return value of bitmap_read() is undefined here. */
+	bitmap_read(NULL, 0, READ_ONCE(zero_bits));
+
+	/*
+	 * Reading/writing more than BITS_PER_LONG bits should not crash the
+	 * kernel. READ_ONCE() prevents constant folding.
+	 */
+	bitmap_write(NULL, 0, 0, READ_ONCE(bits_per_long) + 1);
+	/* Return value of bitmap_read() is undefined here. */
+	bitmap_read(NULL, 0, READ_ONCE(bits_per_long) + 1);
+
+	/*
+	 * Ensure that bitmap_read() reads the same value that was previously
+	 * written, and two consequent values are correctly merged.
+	 * The resulting bit pattern is asymmetric to rule out possible issues
+	 * with bit numeration order.
+	 */
+	for (i = 0; i < TEST_BIT_LEN - 7; i++) {
+		bitmap_zero(bitmap, TEST_BIT_LEN);
+
+		bitmap_write(bitmap, 0b10101UL, i, 5);
+		val = bitmap_read(bitmap, i, 5);
+		expect_eq_ulong(0b10101UL, val);
+
+		bitmap_write(bitmap, 0b101UL, i + 5, 3);
+		val = bitmap_read(bitmap, i + 5, 3);
+		expect_eq_ulong(0b101UL, val);
+
+		val = bitmap_read(bitmap, i, 8);
+		expect_eq_ulong(0b10110101UL, val);
+	}
+
+	for (pi = 0; pi < ARRAY_SIZE(pattern); pi++)
+		test_bitmap_write_helper(pattern[pi]);
+}
+
+static void __init test_bitmap_read_perf(void)
+{
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	unsigned int cnt, nbits, i;
+	unsigned long val;
+	ktime_t time;
+
+	bitmap_fill(bitmap, TEST_BIT_LEN);
+	time = ktime_get();
+	for (cnt = 0; cnt < 5; cnt++) {
+		for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
+			for (i = 0; i < TEST_BIT_LEN; i++) {
+				if (i + nbits > TEST_BIT_LEN)
+					break;
+				/*
+				 * Prevent the compiler from optimizing away the
+				 * bitmap_read() by using its value.
+				 */
+				WRITE_ONCE(val, bitmap_read(bitmap, i, nbits));
+			}
+		}
+	}
+	time = ktime_get() - time;
+	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+}
+
+static void __init test_bitmap_write_perf(void)
+{
+	DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+	unsigned int cnt, nbits, i;
+	unsigned long val = 0xfeedface;
+	ktime_t time;
+
+	bitmap_zero(bitmap, TEST_BIT_LEN);
+	time = ktime_get();
+	for (cnt = 0; cnt < 5; cnt++) {
+		for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
+			for (i = 0; i < TEST_BIT_LEN; i++) {
+				if (i + nbits > TEST_BIT_LEN)
+					break;
+				bitmap_write(bitmap, val, i, nbits);
+			}
+		}
+	}
+	time = ktime_get() - time;
+	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+}
+
+#undef TEST_BIT_LEN
+
 static void __init selftest(void)
 {
 	test_zero_clear();
@@ -1261,6 +1423,9 @@ static void __init selftest(void)
 	test_bitmap_cut();
 	test_bitmap_print_buf();
 	test_bitmap_const_eval();
+	test_bitmap_read_write();
+	test_bitmap_read_perf();
+	test_bitmap_write_perf();
 
 	test_find_nth_bit();
 	test_for_each_set_bit();
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sun Dec 28 00:48:26 2025
Date: Thu, 14 Dec 2023 12:06:35 +0100
In-Reply-To: <20231214110639.2294687-1-glider@google.com>
References: <20231214110639.2294687-1-glider@google.com>
Message-ID: <20231214110639.2294687-4-glider@google.com>
Subject: [PATCH v10-mte 3/7] lib/test_bitmap: use pr_info() for non-error messages
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

pr_err() messages may be treated as errors by some log readers, so let
us only use them for test failures. For non-error messages, replace them
with pr_info().

Suggested-by: Alexander Lobakin
Signed-off-by: Alexander Potapenko
Acked-by: Yury Norov
---
v10-mte:
 - send this patch together with the "Implement MTE tag compression for
   swapped pages"
---
 lib/test_bitmap.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 46c0154680772..a6e92cf5266af 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -507,7 +507,7 @@ static void __init test_bitmap_parselist(void)
 		}
 
 		if (ptest.flags & PARSE_TIME)
-			pr_err("parselist: %d: input is '%s' OK, Time: %llu\n",
+			pr_info("parselist: %d: input is '%s' OK, Time: %llu\n",
 					i, ptest.in, time);
 
 #undef ptest
@@ -546,7 +546,7 @@ static void __init test_bitmap_printlist(void)
 		goto out;
 	}
 
-	pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
+	pr_info("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
 out:
 	kfree(buf);
 	kfree(bmap);
@@ -624,7 +624,7 @@ static void __init test_bitmap_parse(void)
 		}
 
 		if (test.flags & PARSE_TIME)
-			pr_err("parse: %d: input is '%s' OK, Time: %llu\n",
+			pr_info("parse: %d: input is '%s' OK, Time: %llu\n",
 					i, test.in, time);
 	}
 }
@@ -1380,7 +1380,7 @@ static void __init test_bitmap_read_perf(void)
 		}
 	}
 	time = ktime_get() - time;
-	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+	pr_info("Time spent in %s:\t%llu\n", __func__, time);
 }
 
 static void __init test_bitmap_write_perf(void)
@@ -1402,7 +1402,7 @@ static void __init test_bitmap_write_perf(void)
 		}
 	}
 	time = ktime_get() - time;
-	pr_err("Time spent in %s:\t%llu\n", __func__, time);
+	pr_info("Time spent in %s:\t%llu\n", __func__, time);
 }
 
 #undef TEST_BIT_LEN
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sun Dec 28 00:48:26 2025
Date: Thu, 14 Dec 2023 12:06:36 +0100
In-Reply-To: <20231214110639.2294687-1-glider@google.com>
References: <20231214110639.2294687-1-glider@google.com>
Message-ID: <20231214110639.2294687-5-glider@google.com>
Subject: [PATCH v10-mte 4/7] arm64: mte: implement CONFIG_ARM64_MTE_COMP
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org,
 pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk,
 yury.norov@gmail.com, alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

The config implements the algorithm compressing memory tags for ARM MTE
during swapping.

The algorithm is based on RLE and specifically targets buffers of tags
corresponding to a single page. In many cases a buffer can be compressed
into 63 bits, making it possible to store it without additional memory
allocation.
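
For illustration only: the first step of a run-length scheme like the one
described above is to split the tag buffer into runs of equal tags. The
sketch below is not the code from this patch; sk_tag_run, sk_get_tag() and
sk_tags_to_runs() are hypothetical names, and the nibble order within a byte
is an assumption. How the runs are then packed into 63 bits is covered by
Documentation/arch/arm64/mte-tag-compression.rst and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run record: a 4-bit tag and the number of granules it covers. */
struct sk_tag_run {
	unsigned char tag;
	unsigned short len;
};

/* Fetch tag i from a packed buffer: two tags per byte, low nibble first (assumed). */
static unsigned char sk_get_tag(const unsigned char *tags, size_t i)
{
	return (tags[i / 2] >> ((i % 2) * 4)) & 0xf;
}

/*
 * Split ntags packed tags into runs of equal values.
 * Returns the number of runs, or 0 if they do not fit into max_runs.
 */
static size_t sk_tags_to_runs(const unsigned char *tags, size_t ntags,
			      struct sk_tag_run *runs, size_t max_runs)
{
	size_t i, n = 0;

	for (i = 0; i < ntags; i++) {
		unsigned char t = sk_get_tag(tags, i);

		/* Same tag as the current run: just extend it. */
		if (n && runs[n - 1].tag == t) {
			runs[n - 1].len++;
			continue;
		}
		/* A new run is needed; give up if there is no room left. */
		if (n == max_runs)
			return 0;
		runs[n].tag = t;
		runs[n].len = 1;
		n++;
	}
	return n;
}

/* Self-check: a buffer with a single tag change collapses into two runs. */
static void sk_rle_selfcheck(void)
{
	unsigned char tags[128];	/* 256 tags: a 4 KiB page at 16 bytes per tag */
	struct sk_tag_run runs[4];
	size_t i;

	for (i = 0; i < sizeof(tags); i++)
		tags[i] = i < 64 ? 0x11 : 0x22;
	assert(sk_tags_to_runs(tags, 256, runs, 4) == 2);
	assert(runs[0].tag == 1 && runs[0].len == 128);
	assert(runs[1].tag == 2 && runs[1].len == 128);
}
```

A page whose tags rarely change yields only a handful of runs, which is the
case where a compact inline encoding becomes plausible; a page with
alternating tags yields one run per tag and is simply reported as not
fitting.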
Suggested-by: Evgenii Stepanov Signed-off-by: Alexander Potapenko Acked-by: Catalin Marinas --- v10-mte: - added Catalin's Acked-by: v8: - As suggested by Catalin Marinas, only compress tags if they can be stored inline. This simplifies the code drastically. - Update the documentation. - Split off patches introducing bitmap_read()/bitmap_write(). v6: - shuffle bits in inline handles so that they can't be confused with canonical pointers; - use kmem_cache_zalloc() to allocate compressed storage - correctly handle range size overflow - minor documentation fixes, clarify the special cases v5: - make code PAGE_SIZE-agnostic, remove hardcoded constants, updated the docs - implement debugfs interface - Address comments by Andy Shevchenko: - update description of mtecomp.c - remove redundant assignments, simplify mte_tags_to_ranges() - various code simplifications - introduce mtecomp.h - add :export: to Documentation/arch/arm64/mte-tag-compression.rst v4: - Addressed comments by Andy Shevchenko: - expanded "MTE" to "Memory Tagging Extension" in Kconfig - fixed kernel-doc comments, moved them to C source - changed variables to unsigned where applicable - some code simplifications, fewer unnecessary assignments - added the mte_largest_idx_bits() helper - added namespace prefixes to all functions - added missing headers (but removed bits.h) - Addressed comments by Yury Norov: - removed test-only functions from mtecomp.h - dropped the algorithm name (all functions are now prefixed with "mte") - added more comments - got rid of MTE_RANGES_INLINE - renamed bitmap_{get,set}_value() to bitmap_{read,write}() - moved the big comment explaining the algorithm to Documentation/arch/arm64/mte-tag-compression.rst, expanded it, added a link to it from Documentation/arch/arm64/index.rst - removed hardcoded ranges from mte_alloc_size()/mte_size_to_ranges() v3: - Addressed comments by Andy Shevchenko: - use bitmap_{set,get}_value() written by Syed Nayyar Waris - switched to unsigned long
everywhere (fewer casts) - simplified the code, removed redundant checks - dropped ea0_compress_inline() - added bit size constants and helpers to access the bitmap - explicitly initialize all compressed sizes in ea0_compress_to_buf() - initialize all handle bits v2: - as suggested by Yury Norov, switched from struct bitq (which is not needed anymore) to - add missing symbol exports --- Documentation/arch/arm64/index.rst | 1 + .../arch/arm64/mte-tag-compression.rst | 154 +++++++++++ arch/arm64/Kconfig | 11 + arch/arm64/include/asm/mtecomp.h | 39 +++ arch/arm64/mm/Makefile | 1 + arch/arm64/mm/mtecomp.c | 257 ++++++++++++++++++ arch/arm64/mm/mtecomp.h | 12 + 7 files changed, 475 insertions(+) create mode 100644 Documentation/arch/arm64/mte-tag-compression.rst create mode 100644 arch/arm64/include/asm/mtecomp.h create mode 100644 arch/arm64/mm/mtecomp.c create mode 100644 arch/arm64/mm/mtecomp.h diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/= index.rst index d08e924204bf1..bf6c1583233a9 100644 --- a/Documentation/arch/arm64/index.rst +++ b/Documentation/arch/arm64/index.rst @@ -19,6 +19,7 @@ ARM64 Architecture legacy_instructions memory memory-tagging-extension + mte-tag-compression perf pointer-authentication ptdump diff --git a/Documentation/arch/arm64/mte-tag-compression.rst b/Documentati= on/arch/arm64/mte-tag-compression.rst new file mode 100644 index 0000000000000..8fe6b51a9db6d --- /dev/null +++ b/Documentation/arch/arm64/mte-tag-compression.rst @@ -0,0 +1,154 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D +Tag Compression for Memory Tagging Extension (MTE) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D + +This document describes the algorithm used to compress memory tags used by= the +ARM Memory Tagging Extension (MTE). + +Introduction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +MTE assigns tags to memory pages: for 4K pages those tags occupy 128 bytes +(256 4-bit tags each corresponding to a 16-byte MTE granule), for 16K page= s - +512 bytes, for 64K pages - 2048 bytes. By default, MTE carves out 3.125% (= 1/16) +of the available physical memory to store the tags. + +When MTE pages are saved to swap, their tags need to be stored in the kern= el +memory. If the system swap is used heavily, these tags may take a substant= ial +portion of the physical memory. To reduce memory waste, ``CONFIG_ARM64_MTE= _COMP`` +allows the kernel to store the tags in compressed form. + +Implementation details +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The algorithm attempts to compress an array of ``MTE_PAGE_TAG_STORAGE`` +tag bytes into a byte sequence that can be stored in an 8-byte pointer. If= that +is not possible, the data is stored uncompressed. + +Tag manipulation and storage +---------------------------- + +Tags for swapped pages are stored in an XArray that maps swap entries to 6= 3-bit +values (see ``arch/arm64/mm/mteswap.c``). Bit 0 of these values indicates = how +their contents should be treated: + + - 0: value is a pointer to an uncompressed buffer allocated with kmalloc() + (always the case if ``CONFIG_ARM64_MTE_COMP=3Dn``) with the highest bit= set + to 0; + - 1: value contains compressed data. 
+ +``arch/arm64/include/asm/mtecomp.h`` declares the following functions that +manipulate tags: + +- mte_compress() - compresses the given ``MTE_PAGE_TAG_STORAGE``-byte ``ta= gs`` + buffer into a pointer; +- mte_decompress() - decompresses the tags from a pointer; +- mte_is_compressed() - returns ``true`` iff the pointer passed to it shou= ld be + treated as compressed data. + +Tag compression +--------------- + +The compression algorithm is a variation of RLE (run-length encoding) and = works +as follows (we consider 4K pages and 128-byte tag buffers, but = the +same approach scales to 16K and 64K pages): + +1. The input array of 128 (``MTE_PAGE_TAG_STORAGE``) bytes is transformed = into + tag ranges (two arrays: ``r_tags[]`` containing tag values and ``r_size= s[]`` + containing range lengths) by mte_tags_to_ranges(). Note that + ``r_sizes[]`` sums up to 256 (``MTE_GRANULES_PER_PAGE``). + + If ``r_sizes[]`` consists of a single element + (``{ MTE_GRANULES_PER_PAGE }``), the corresponding range is split into = two + halves, i.e.:: + + r_sizes_new[2] =3D { MTE_GRANULES_PER_PAGE/2, MTE_GRANULES_PER_PAGE/2= }; + r_tags_new[2] =3D { r_tags[0], r_tags[0] }; + +2. The index of the largest element of ``r_sizes[]`` is stored in + ``largest_idx``. The element itself is dropped from ``r_sizes[]``, + because it can be reconstructed from the sum of the remaining elements.= Note + that now none of the remaining ``r_sizes[]`` elements exceeds + ``MTE_GRANULES_PER_PAGE/2``. + +3. If the number ``N`` of ranges does not exceed ``6``, the ranges can be + compressed into 64 bits.
This is done by storing the following values p= acked + into the pointer (``i<size>`` means a ``<size>``-bit unsigned integer) + treated as a bitmap (see ``include/linux/bitmap.h``):: + + bit 0 : (always 1) : i1 + bits 1-3 : largest_idx : i3 + bits 4-27 : r_tags[0..5] : i4 x 6 + bits 28-62 : r_sizes[0..4] : i7 x 5 + bit 63 : (always 0) : i1 + + If N is less than 6, ``r_tags`` and ``r_sizes`` are padded with zero + values. The unused bits in the pointer, including bit 63, are also set = to 0, + so the compressed data can be stored in XArray. + + Range size of ``MTE_GRANULES_PER_PAGE/2`` (at most one) does not fit in= to + i7 and will be written as 0. This case is handled separately by the + decompressing procedure. + +Tag decompression +----------------- + +The decompression algorithm performs the steps below. + +1. Read the lowest bit of the data from the input buffer and check that it= is 1, + otherwise bail out. + +2. Read ``largest_idx``, ``r_tags[]`` and ``r_sizes[]`` from the + input buffer. + + If ``largest_idx`` is zero, and all ``r_sizes[]`` are zero, set + ``r_sizes[0] =3D MTE_GRANULES_PER_PAGE/2``. + + Calculate the removed largest element of ``r_sizes[]`` as + ``largest =3D 256 - sum(r_sizes)`` and insert it into ``r_sizes`` at + position ``largest_idx``. + +3. For each ``r_sizes[i] > 0``, add a 4-bit value ``r_tags[i]`` to the out= put + buffer ``r_sizes[i]`` times. + + +Why these numbers? +------------------ + +To be able to reconstruct ``N`` tag ranges from the compressed data, we ne= ed to +store the indicator bit together with ``largest_idx``, ``r_tags[N]``, and +``r_sizes[N-1]`` in 63 bits. +Knowing that the sizes do not exceed ``MTE_PAGE_TAG_STORAGE``, each of the= m can be +packed into ``S =3D ilog2(MTE_PAGE_TAG_STORAGE)`` bits, whereas a single t= ag occupies +4 bits. + +It is evident that the number of ranges that can be stored in 63 bits is +strictly less than 8, therefore we only need 3 bits to store ``largest_idx= ``.
+ +The maximum values of ``N`` so that the number ``1 + 3 + N * 4 + (N-1) * S= `` of +storage bits does not exceed 63, are shown in the table below:: + + +-----------+-----------------+----+---+-------------------+ + | Page size | Tag buffer size | S | N | Storage bits | + +-----------+-----------------+----+---+-------------------+ + | 4 KB | 128 B | 7 | 6 | 63 =3D 1+3+6*4+5*7 | + | 16 KB | 512 B | 9 | 5 | 60 =3D 1+3+5*4+4*9 | + | 64 KB | 2048 B | 11 | 4 | 53 =3D 1+3+4*4+3*11 | + +-----------+-----------------+----+---+-------------------+ + +Note +---- + +Tag compression and decompression implicitly rely on the fixed MTE tag size +(4 bits) and number of tags per page. Should these values change, the algo= rithm +may need to be revised. + + +Programming Interface +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + + .. kernel-doc:: arch/arm64/include/asm/mtecomp.h + .. kernel-doc:: arch/arm64/mm/mtecomp.c + :export: diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 7b071a00425d2..5f4d4b49a512e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2078,6 +2078,17 @@ config ARM64_EPAN if the cpu does not implement the feature. endmenu # "ARMv8.7 architectural features" =20 +config ARM64_MTE_COMP + bool "Tag compression for ARM64 Memory Tagging Extension" + default y + depends on ARM64_MTE + help + Enable tag compression support for ARM64 Memory Tagging Extension. + + Tag buffers corresponding to swapped RAM pages are compressed using + RLE to conserve heap memory. In the common case compressed tags + occupy 2.5x less memory. 
+ config ARM64_SVE bool "ARM Scalable Vector Extension support" default y diff --git a/arch/arm64/include/asm/mtecomp.h b/arch/arm64/include/asm/mtec= omp.h new file mode 100644 index 0000000000000..b9a3a921a38d4 --- /dev/null +++ b/arch/arm64/include/asm/mtecomp.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ASM_MTECOMP_H +#define __ASM_MTECOMP_H + +#include + +/** + * mte_is_compressed() - check if the supplied pointer contains compressed= tags. + * @ptr: pointer returned by kmalloc() or mte_compress(). + * + * Returns: true iff bit 0 of @ptr is 1, which is only possible if @ptr was + * returned by mte_compress(). + */ +static inline bool mte_is_compressed(void *ptr) +{ + return ((unsigned long)ptr & 1); +} + +#if defined(CONFIG_ARM64_MTE_COMP) + +void *mte_compress(u8 *tags); +bool mte_decompress(void *data, u8 *tags); + +#else + +static inline void *mte_compress(u8 *tags) +{ + return NULL; +} + +static inline bool mte_decompress(void *data, u8 *tags) +{ + return false; +} + +#endif // CONFIG_ARM64_MTE_COMP + +#endif // __ASM_MTECOMP_H diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index dbd1bc95967d0..46778f6dd83c2 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -10,6 +10,7 @@ obj-$(CONFIG_TRANS_TABLE) +=3D trans_pgd.o obj-$(CONFIG_TRANS_TABLE) +=3D trans_pgd-asm.o obj-$(CONFIG_DEBUG_VIRTUAL) +=3D physaddr.o obj-$(CONFIG_ARM64_MTE) +=3D mteswap.o +obj-$(CONFIG_ARM64_MTE_COMP) +=3D mtecomp.o KASAN_SANITIZE_physaddr.o +=3D n =20 obj-$(CONFIG_KASAN) +=3D kasan_init.o diff --git a/arch/arm64/mm/mtecomp.c b/arch/arm64/mm/mtecomp.c new file mode 100644 index 0000000000000..c948921525030 --- /dev/null +++ b/arch/arm64/mm/mtecomp.c @@ -0,0 +1,257 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * MTE tag compression algorithm. + * See Documentation/arch/arm64/mte-tag-compression.rst for more details.
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "mtecomp.h" + +#define MTE_BITS_PER_LARGEST_IDX 3 +/* Range size cannot exceed MTE_GRANULES_PER_PAGE / 2. */ +#define MTE_BITS_PER_SIZE (ilog2(MTE_GRANULES_PER_PAGE) - 1) + +/* + * See Documentation/arch/arm64/mte-tag-compression.rst for details on how= the + * maximum number of ranges is calculated. + */ +#if defined(CONFIG_ARM64_4K_PAGES) +#define MTE_MAX_RANGES 6 +#elif defined(CONFIG_ARM64_16K_PAGES) +#define MTE_MAX_RANGES 5 +#else +#define MTE_MAX_RANGES 4 +#endif + +/** + * mte_tags_to_ranges() - break @tags into arrays of tag ranges. + * @tags: MTE_GRANULES_PER_PAGE-byte array containing MTE tags. + * @out_tags: u8 array to store the tag of every range. + * @out_sizes: unsigned short array to store the size of every range. + * @out_len: length of @out_tags and @out_sizes (output parameter, initial= ly + * equal to lengths of out_tags[] and out_sizes[]). + * + * This function is exported for testing purposes. + */ +void mte_tags_to_ranges(u8 *tags, u8 *out_tags, unsigned short *out_sizes, + size_t *out_len) +{ + u8 prev_tag =3D tags[0] / 16; /* First tag in the array. */ + unsigned int cur_idx =3D 0, i, j; + u8 cur_tag; + + memset(out_tags, 0, array_size(*out_len, sizeof(*out_tags))); + memset(out_sizes, 0, array_size(*out_len, sizeof(*out_sizes))); + + out_tags[cur_idx] =3D prev_tag; + for (i =3D 0; i < MTE_GRANULES_PER_PAGE; i++) { + j =3D i % 2; + cur_tag =3D j ? (tags[i / 2] % 16) : (tags[i / 2] / 16); + if (cur_tag =3D=3D prev_tag) { + out_sizes[cur_idx]++; + } else { + cur_idx++; + prev_tag =3D cur_tag; + out_tags[cur_idx] =3D prev_tag; + out_sizes[cur_idx] =3D 1; + } + } + *out_len =3D cur_idx + 1; +} +EXPORT_SYMBOL_NS(mte_tags_to_ranges, MTECOMP); + +/** + * mte_ranges_to_tags() - fill @tags using given tag ranges. + * @r_tags: u8[] containing the tag of every range. 
+ * @r_sizes: unsigned short[] containing the size of every range. + * @r_len: length of @r_tags and @r_sizes. + * @tags: MTE_GRANULES_PER_PAGE-byte array to write the tags to. + * + * This function is exported for testing purposes. + */ +void mte_ranges_to_tags(u8 *r_tags, unsigned short *r_sizes, size_t r_len, + u8 *tags) +{ + unsigned int i, j, pos =3D 0; + u8 prev; + + for (i =3D 0; i < r_len; i++) { + for (j =3D 0; j < r_sizes[i]; j++) { + if (pos % 2) + tags[pos / 2] =3D (prev << 4) | r_tags[i]; + else + prev =3D r_tags[i]; + pos++; + } + } +} +EXPORT_SYMBOL_NS(mte_ranges_to_tags, MTECOMP); + +static void mte_bitmap_write(unsigned long *bitmap, unsigned long value, + unsigned long *pos, unsigned long bits) +{ + bitmap_write(bitmap, value, *pos, bits); + *pos +=3D bits; +} + +/* Compress ranges into an unsigned long. */ +static void mte_compress_to_ulong(size_t len, u8 *tags, unsigned short *si= zes, + unsigned long *result) +{ + unsigned long bit_pos =3D 0; + unsigned int largest_idx, i; + unsigned short largest =3D 0; + + for (i =3D 0; i < len; i++) { + if (sizes[i] > largest) { + largest =3D sizes[i]; + largest_idx =3D i; + } + } + /* Bit 1 in position 0 indicates compressed data. */ + mte_bitmap_write(result, 1, &bit_pos, 1); + mte_bitmap_write(result, largest_idx, &bit_pos, + MTE_BITS_PER_LARGEST_IDX); + for (i =3D 0; i < len; i++) + mte_bitmap_write(result, tags[i], &bit_pos, MTE_TAG_SIZE); + if (len =3D=3D 1) { + /* + * We are compressing MTE_GRANULES_PER_PAGE of identical tags. + * Split it into two ranges containing + * MTE_GRANULES_PER_PAGE / 2 tags, so that it falls into the + * special case described below. 
+ */ + mte_bitmap_write(result, tags[0], &bit_pos, MTE_TAG_SIZE); + i =3D 2; + } else { + i =3D len; + } + for (; i < MTE_MAX_RANGES; i++) + mte_bitmap_write(result, 0, &bit_pos, MTE_TAG_SIZE); + /* + * Most of the time sizes[i] fits into MTE_BITS_PER_SIZE, apart from a + * special case when: + * len =3D 2; + * sizes =3D { MTE_GRANULES_PER_PAGE / 2, MTE_GRANULES_PER_PAGE / 2}; + * In this case largest_idx will be set to 0, and the size written to + * the bitmap will be also 0. + */ + for (i =3D 0; i < len; i++) { + if (i !=3D largest_idx) + mte_bitmap_write(result, sizes[i], &bit_pos, + MTE_BITS_PER_SIZE); + } + for (i =3D len; i < MTE_MAX_RANGES; i++) + mte_bitmap_write(result, 0, &bit_pos, MTE_BITS_PER_SIZE); +} + +/** + * mte_compress() - compress the given tag array. + * @tags: MTE_GRANULES_PER_PAGE-byte array to read the tags from. + * + * Attempts to compress the user-supplied tag array. + * + * Returns: compressed data or NULL. + */ +void *mte_compress(u8 *tags) +{ + unsigned short *r_sizes; + void *result =3D NULL; + u8 *r_tags; + size_t r_len; + + r_sizes =3D kmalloc_array(MTE_GRANULES_PER_PAGE, sizeof(unsigned short), + GFP_KERNEL); + r_tags =3D kmalloc(MTE_GRANULES_PER_PAGE, GFP_KERNEL); + if (!r_sizes || !r_tags) + goto ret; + r_len =3D MTE_GRANULES_PER_PAGE; + mte_tags_to_ranges(tags, r_tags, r_sizes, &r_len); + if (r_len <=3D MTE_MAX_RANGES) + mte_compress_to_ulong(r_len, r_tags, r_sizes, + (unsigned long *)&result); +ret: + kfree(r_tags); + kfree(r_sizes); + return result; +} +EXPORT_SYMBOL_NS(mte_compress, MTECOMP); + +static unsigned long mte_bitmap_read(const unsigned long *bitmap, + unsigned long *pos, unsigned long bits) +{ + unsigned long start =3D *pos; + + *pos +=3D bits; + return bitmap_read(bitmap, start, bits); +} + +/** + * mte_decompress() - decompress the tag array from the given pointer. + * @data: pointer returned by @mte_compress() + * @tags: MTE_GRANULES_PER_PAGE-byte array to write the tags to. 
+ * + * Reads the compressed data and writes it into the user-supplied tag arra= y. + * + * Returns: true on success, false if the passed data is uncompressed. + */ +bool mte_decompress(void *data, u8 *tags) +{ + unsigned short r_sizes[MTE_MAX_RANGES]; + u8 r_tags[MTE_MAX_RANGES]; + unsigned int largest_idx, i; + unsigned long bit_pos =3D 0; + unsigned long *bitmap; + unsigned short sum; + size_t max_ranges; + + if (!mte_is_compressed(data)) + return false; + + bitmap =3D (unsigned long *)&data; + max_ranges =3D MTE_MAX_RANGES; + /* Skip the leading bit indicating the inline case. */ + mte_bitmap_read(bitmap, &bit_pos, 1); + largest_idx =3D + mte_bitmap_read(bitmap, &bit_pos, MTE_BITS_PER_LARGEST_IDX); + if (largest_idx >=3D MTE_MAX_RANGES) + return false; + + for (i =3D 0; i < max_ranges; i++) + r_tags[i] =3D mte_bitmap_read(bitmap, &bit_pos, MTE_TAG_SIZE); + for (i =3D 0, sum =3D 0; i < max_ranges; i++) { + if (i =3D=3D largest_idx) + continue; + r_sizes[i] =3D + mte_bitmap_read(bitmap, &bit_pos, MTE_BITS_PER_SIZE); + /* + * Special case: tag array consists of two ranges of + * `MTE_GRANULES_PER_PAGE / 2` tags. + */ + if ((largest_idx =3D=3D 0) && (i =3D=3D 1) && (r_sizes[i] =3D=3D 0)) + r_sizes[i] =3D MTE_GRANULES_PER_PAGE / 2; + if (!r_sizes[i]) { + max_ranges =3D i; + break; + } + sum +=3D r_sizes[i]; + } + if (sum >=3D MTE_GRANULES_PER_PAGE) + return false; + r_sizes[largest_idx] =3D MTE_GRANULES_PER_PAGE - sum; + mte_ranges_to_tags(r_tags, r_sizes, max_ranges, tags); + return true; +} +EXPORT_SYMBOL_NS(mte_decompress, MTECOMP); diff --git a/arch/arm64/mm/mtecomp.h b/arch/arm64/mm/mtecomp.h new file mode 100644 index 0000000000000..b94cf0384f2af --- /dev/null +++ b/arch/arm64/mm/mtecomp.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef ARCH_ARM64_MM_MTECOMP_H_ +#define ARCH_ARM64_MM_MTECOMP_H_ + +/* Functions exported from mtecomp.c for test_mtecomp.c. 
*/ +void mte_tags_to_ranges(u8 *tags, u8 *out_tags, unsigned short *out_sizes, + size_t *out_len); +void mte_ranges_to_tags(u8 *r_tags, unsigned short *r_sizes, size_t r_len, + u8 *tags); + +#endif // ARCH_ARM64_MM_TEST_MTECOMP_H_ --=20 2.43.0.472.g3155946c3a-goog From nobody Sun Dec 28 00:48:26 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30992C4332F for ; Thu, 14 Dec 2023 11:07:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1444066AbjLNLHa (ORCPT ); Thu, 14 Dec 2023 06:07:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443979AbjLNLHO (ORCPT ); Thu, 14 Dec 2023 06:07:14 -0500 Received: from mail-wr1-x449.google.com (mail-wr1-x449.google.com [IPv6:2a00:1450:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7BF2F191 for ; Thu, 14 Dec 2023 03:07:01 -0800 (PST) Received: by mail-wr1-x449.google.com with SMTP id ffacd0b85a97d-333501e22caso6205383f8f.2 for ; Thu, 14 Dec 2023 03:07:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1702552020; x=1703156820; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Z23ZDr01NkgZyjyXcvWE5FnYbJhU70TfuuBALSdMK5s=; b=O94Gb3hIlovf0Re2+/vnaVhtSDte358hXAzkCXS636turm0lAFyke2fN7VC2Ygq9mR flt2SYBMrk3L9Xg24nv0Xx+r/jnQm8NyQrbSS87U1vCd7/269WZPUVQXe9cGn8EiBguv J3+7cyWWzS97Nwmj4Rl71uZkR0LRytpuDNoa0c6wBcxVLgPrIILgiV8CqtJ4WmHE4KUm 2zknCOY1/H4P+pXSuZUj0YN9Eh//Oi0lEuFqXdag7jQ5xQIYeHYdySIr2r8KkdBfOQgq wEjpYD3SGMiIOcI2h8f9g3Eu9IfrWb7p+2Jc+q0lGkxqbHAvi/dVhVPzgxHfdkh88qgx HJCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20230601; t=1702552020; x=1703156820; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Z23ZDr01NkgZyjyXcvWE5FnYbJhU70TfuuBALSdMK5s=; b=uk7x2nPCujgtm5wlh5KwFPP5ne/syRJ8IFWKpp3WmRVeTxnbY11LXrycSh+JPZowSs 4XB04TV0oSwCcUh2KuDJPlkHkaFP4jqgFqMsx2xM0kCKUYo475JaHHBvMh//iuz/9bek Qair3dC+ITYUNnqPjvWydixAE4PWZ7DvdxjidjPg+eCm48As3Mn3BRXi8t20C7YwFxm7 m6z+9Uxn1bSjkkkOGrhe+iff7zggZ4ShWp7tQhu3kk7B+v0KB8o/GIkuVi4V/dCs0HNf KuO4JOEaP206ri++aKnng/PIJad02Mtg2FkbEu6ZQWPwUT1OKnpvW0pEuUChhclWoGSF eNOA== X-Gm-Message-State: AOJu0YxPAlD0NlDRm1M4uN6U5uFpl/Jh9kE/k1ZMUyGz8zkX9T4e6rBu 5+a7BDb3flrUI9P1CsDt+LPo9Xv4H/E= X-Google-Smtp-Source: AGHT+IEffuRK7eMhk5v3MPxIHyqDU9QxpDmCoK2fcfSBY82IEhz5qti3EblDU7cstiHLqMSjdDiZR9T2KOI= X-Received: from glider.muc.corp.google.com ([2a00:79e0:9c:201:8447:3e89:1f77:ff8a]) (user=glider job=sendgmr) by 2002:a5d:47ad:0:b0:336:365c:ac3e with SMTP id 13-20020a5d47ad000000b00336365cac3emr20573wrb.10.1702552019764; Thu, 14 Dec 2023 03:06:59 -0800 (PST) Date: Thu, 14 Dec 2023 12:06:37 +0100 In-Reply-To: <20231214110639.2294687-1-glider@google.com> Mime-Version: 1.0 References: <20231214110639.2294687-1-glider@google.com> X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog Message-ID: <20231214110639.2294687-6-glider@google.com> Subject: [PATCH v10-mte 5/7] arm64: mte: add a test for MTE tags compression From: Alexander Potapenko To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com, aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com, alexandru.elisei@arm.com Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Ensure that tag 
sequences containing alternating values are compressed to buffers of expected size and correctly decompressed afterwards. Signed-off-by: Alexander Potapenko Acked-by: Catalin Marinas --- v10-mte: - added Catalin's Acked-by: v9: - minor changes to Kconfig description v8: - adapt to the simplified compression algorithm v6: - add test_decompress_invalid() to ensure invalid handles are ignored; - add test_upper_bits(), which is a regression test for a case where an inline handle looked like an out-of-line one; - add test_compress_nonzero() to ensure a full nonzero tag array is compressed correctly; - add test_two_ranges() to test cases when the input buffer is divided into two ranges. v5: - remove hardcoded constants, added test setup/teardown; - support 16- and 64K pages; - replace nested if-clauses with expected_size_from_ranges(); - call mte_release_handle() after tests that perform compression/decompression; - address comments by Andy Shevchenko: - fix include order; - use mtecomp.h instead of function prototypes. 
v4: - addressed comments by Andy Shevchenko: - expanded MTE to "Memory Tagging Extension" in Kconfig - changed signed variables to unsigned where applicable - added missing header dependencies - addressed comments by Yury Norov: - moved test-only declarations from mtecomp.h into this test - switched to the new "mte"-prefixed function names, dropped the mentions of "EA0" - added test_tag_to_ranges_n() v3: - addressed comments by Andy Shevchenko in another patch: - switched from u64 to unsigned long - added MODULE_IMPORT_NS(MTECOMP) - fixed includes order --- arch/arm64/Kconfig | 11 ++ arch/arm64/mm/Makefile | 1 + arch/arm64/mm/test_mtecomp.c | 364 +++++++++++++++++++++++++++++++++++ 3 files changed, 376 insertions(+) create mode 100644 arch/arm64/mm/test_mtecomp.c diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 5f4d4b49a512e..6a1397a96f2f0 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2089,6 +2089,17 @@ config ARM64_MTE_COMP RLE to conserve heap memory. In the common case compressed tags occupy 2.5x less memory. =20 +config ARM64_MTE_COMP_KUNIT_TEST + tristate "Test tag compression for ARM64 Memory Tagging Extension" if !KU= NIT_ALL_TESTS + default KUNIT_ALL_TESTS + depends on KUNIT && ARM64_MTE_COMP + help + Test MTE compression algorithm enabled by CONFIG_ARM64_MTE_COMP. + + Ensure that certain tag sequences containing alternating values can + be compressed into pointer-size values and correctly decompressed + afterwards. 
+ config ARM64_SVE bool "ARM Scalable Vector Extension support" default y diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index 46778f6dd83c2..170dc62b010b9 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -11,6 +11,7 @@ obj-$(CONFIG_TRANS_TABLE) +=3D trans_pgd-asm.o obj-$(CONFIG_DEBUG_VIRTUAL) +=3D physaddr.o obj-$(CONFIG_ARM64_MTE) +=3D mteswap.o obj-$(CONFIG_ARM64_MTE_COMP) +=3D mtecomp.o +obj-$(CONFIG_ARM64_MTE_COMP_KUNIT_TEST) +=3D test_mtecomp.o KASAN_SANITIZE_physaddr.o +=3D n =20 obj-$(CONFIG_KASAN) +=3D kasan_init.o diff --git a/arch/arm64/mm/test_mtecomp.c b/arch/arm64/mm/test_mtecomp.c new file mode 100644 index 0000000000000..e8aeb7607ff41 --- /dev/null +++ b/arch/arm64/mm/test_mtecomp.c @@ -0,0 +1,364 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Test cases for MTE tags compression algorithm. + */ + +#include +#include +#include +#include +#include + +#include + +#include + +#include "mtecomp.h" + +/* Per-test storage allocated in mtecomp_test_init(). */ +struct test_data { + u8 *tags, *dtags; + unsigned short *r_sizes; + size_t r_len; + u8 *r_tags; +}; + +/* + * Split td->tags to ranges stored in td->r_tags, td->r_sizes, td->r_len, + * then convert those ranges back to tags stored in td->dtags. + */ +static void tags_to_ranges_to_tags_helper(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + mte_tags_to_ranges(td->tags, td->r_tags, td->r_sizes, &td->r_len); + mte_ranges_to_tags(td->r_tags, td->r_sizes, td->r_len, td->dtags); + KUNIT_EXPECT_EQ(test, memcmp(td->tags, td->dtags, MTE_PAGE_TAG_STORAGE), + 0); +} + +/* + * Test that mte_tags_to_ranges() produces a single range for a zero-fille= d tag + * buffer. 
+ */ +static void test_tags_to_ranges_zero(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + tags_to_ranges_to_tags_helper(test); + + KUNIT_EXPECT_EQ(test, td->r_len, 1); + KUNIT_EXPECT_EQ(test, td->r_tags[0], 0); + KUNIT_EXPECT_EQ(test, td->r_sizes[0], MTE_GRANULES_PER_PAGE); +} + +/* + * Test that a small number of different tags is correctly transformed into + * ranges. + */ +static void test_tags_to_ranges_simple(struct kunit *test) +{ + struct test_data *td =3D test->priv; + const u8 ex_tags[] =3D { 0xa, 0x0, 0xa, 0xb, 0x0 }; + const unsigned short ex_sizes[] =3D { 1, 2, 2, 1, + MTE_GRANULES_PER_PAGE - 6 }; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + td->tags[0] =3D 0xa0; + td->tags[1] =3D 0x0a; + td->tags[2] =3D 0xab; + tags_to_ranges_to_tags_helper(test); + + KUNIT_EXPECT_EQ(test, td->r_len, 5); + KUNIT_EXPECT_EQ(test, memcmp(td->r_tags, ex_tags, sizeof(ex_tags)), 0); + KUNIT_EXPECT_EQ(test, memcmp(td->r_sizes, ex_sizes, sizeof(ex_sizes)), + 0); +} + +/* Test that repeated 0xa0 byte produces MTE_GRANULES_PER_PAGE ranges of l= ength 1. */ +static void test_tags_to_ranges_repeated(struct kunit *test) +{ + struct test_data *td =3D test->priv; + + memset(td->tags, 0xa0, MTE_PAGE_TAG_STORAGE); + tags_to_ranges_to_tags_helper(test); + + KUNIT_EXPECT_EQ(test, td->r_len, MTE_GRANULES_PER_PAGE); +} + +/* Generate a buffer that will contain @nranges of tag ranges. */ +static void gen_tag_range_helper(u8 *tags, int nranges) +{ + unsigned int i; + + memset(tags, 0, MTE_PAGE_TAG_STORAGE); + if (nranges > 1) { + nranges--; + for (i =3D 0; i < nranges / 2; i++) + tags[i] =3D 0xab; + if (nranges % 2) + tags[nranges / 2] =3D 0xa0; + } +} + +/* + * Test that mte_tags_to_ranges()/mte_ranges_to_tags() work for various + * r_len values. 
+ */ +static void test_tag_to_ranges_n(struct kunit *test) +{ + struct test_data *td =3D test->priv; + unsigned int i, j, sum; + + for (i =3D 1; i <=3D MTE_GRANULES_PER_PAGE; i++) { + gen_tag_range_helper(td->tags, i); + tags_to_ranges_to_tags_helper(test); + sum =3D 0; + for (j =3D 0; j < td->r_len; j++) + sum +=3D td->r_sizes[j]; + KUNIT_EXPECT_EQ(test, sum, MTE_GRANULES_PER_PAGE); + } +} + +/* + * Check that the tag buffer in test->priv can be compressed and decompres= sed + * without changes. + */ +static void *compress_decompress_helper(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + handle =3D mte_compress(td->tags); + KUNIT_EXPECT_EQ(test, (unsigned long)handle & BIT_ULL(63), 0); + if (handle) { + KUNIT_EXPECT_TRUE(test, mte_decompress(handle, td->dtags)); + KUNIT_EXPECT_EQ(test, memcmp(td->tags, td->dtags, MTE_PAGE_TAG_STORAGE), + 0); + } + return handle; +} + +/* Test that a zero-filled array is compressed into inline storage. */ +static void test_compress_zero(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + handle =3D compress_decompress_helper(test); + /* Tags are stored inline. */ + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +/* Test that a 0xaa-filled array is compressed into inline storage. */ +static void test_compress_nonzero(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + memset(td->tags, 0xaa, MTE_PAGE_TAG_STORAGE); + handle =3D compress_decompress_helper(test); + /* Tags are stored inline. */ + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +/* + * Test that two tag ranges are compressed into inline storage. + * + * This also covers a special case where both ranges contain + * `MTE_GRANULES_PER_PAGE / 2` tags and overflow the designated range size. 
+ */ +static void test_two_ranges(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + unsigned int i; + size_t r_len =3D 2; + unsigned char r_tags[2] =3D { 0xe, 0x0 }; + unsigned short r_sizes[2]; + + for (i =3D 1; i < MTE_GRANULES_PER_PAGE; i++) { + r_sizes[0] =3D i; + r_sizes[1] =3D MTE_GRANULES_PER_PAGE - i; + mte_ranges_to_tags(r_tags, r_sizes, r_len, td->tags); + handle =3D compress_decompress_helper(test); + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); + } +} + +/* + * Test that a very small number of tag ranges ends up compressed into 8 b= ytes. + */ +static void test_compress_simple(struct kunit *test) +{ + struct test_data *td =3D test->priv; + void *handle; + + memset(td->tags, 0, MTE_PAGE_TAG_STORAGE); + td->tags[0] =3D 0xa0; + td->tags[1] =3D 0x0a; + + handle =3D compress_decompress_helper(test); + /* Tags are stored inline. */ + KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle)); +} + +/* + * Test that a buffer containing @nranges ranges is compressed inline iff + * @exp_inl is true, and that compressed data decompresses into the original + * tag sequence. + */ +static void compress_range_helper(struct kunit *test, int nranges, + bool exp_inl) +{ + struct test_data *td =3D test->priv; + void *handle; + + gen_tag_range_helper(td->tags, nranges); + handle =3D compress_decompress_helper(test); + KUNIT_EXPECT_EQ(test, mte_is_compressed(handle), exp_inl); +} + +static inline size_t max_inline_ranges(void) +{ +#if defined(CONFIG_ARM64_4K_PAGES) + return 6; +#elif defined(CONFIG_ARM64_16K_PAGES) + return 5; +#else + return 4; +#endif +} + +/* + * Test that every number of tag ranges is correctly compressed and + * decompressed. + */ +static void test_compress_ranges(struct kunit *test) +{ + unsigned int i; + bool exp_inl; + + for (i =3D 1; i <=3D MTE_GRANULES_PER_PAGE; i++) { + exp_inl =3D i <=3D max_inline_ranges(); + compress_range_helper(test, i, exp_inl); + } +} + +/* + * Test that invalid handles are ignored by mte_decompress().
+ */
+static void test_decompress_invalid(struct kunit *test)
+{
+	void *handle1 = (void *)0xeb0b0b0100804020;
+	void *handle2 = (void *)0x6b0b0b010080402f;
+	struct test_data *td = test->priv;
+
+	/* handle1 has bit 0 set to 1. */
+	KUNIT_EXPECT_FALSE(test, mte_decompress(handle1, td->dtags));
+	/*
+	 * handle2 is an inline handle, but its largest_idx (bits 1..3)
+	 * is out of bounds for the inline storage.
+	 */
+	KUNIT_EXPECT_FALSE(test, mte_decompress(handle2, td->dtags));
+}
+
+/*
+ * Test that compressed inline tags cannot be confused with out-of-line
+ * pointers.
+ *
+ * Compressed values are written from bit 0 to bit 63, so the size of the last
+ * tag range initially ends up in the upper bits of the inline representation.
+ * Make sure mte_compress() rearranges the bits so that the resulting handle does
+ * not have 0b0111 as the upper four bits.
+ */
+static void test_upper_bits(struct kunit *test)
+{
+	struct test_data *td = test->priv;
+	void *handle;
+	unsigned char r_tags[6] = { 7, 0, 7, 0, 7, 0 };
+	unsigned short r_sizes[6] = { 1, 1, 1, 1, 1, 1 };
+	size_t r_len;
+
+	/* Maximum number of ranges that can be encoded inline. */
+	r_len = max_inline_ranges();
+	/* Maximum range size possible, will be omitted. */
+	r_sizes[0] = MTE_GRANULES_PER_PAGE / 2 - 1;
+	/* A number close to r_sizes[0] that has most of its bits set. */
+	r_sizes[r_len - 1] = MTE_GRANULES_PER_PAGE - r_sizes[0] - r_len + 2;
+
+	mte_ranges_to_tags(r_tags, r_sizes, r_len, td->tags);
+	handle = compress_decompress_helper(test);
+	KUNIT_EXPECT_TRUE(test, mte_is_compressed(handle));
+}
+
+static void mtecomp_dealloc_testdata(struct test_data *td)
+{
+	kfree(td->tags);
+	kfree(td->dtags);
+	kfree(td->r_sizes);
+	kfree(td->r_tags);
+}
+
+static int mtecomp_test_init(struct kunit *test)
+{
+	struct test_data *td;
+
+	td = kmalloc(sizeof(struct test_data), GFP_KERNEL);
+	if (!td)
+		return 1;
+	td->tags = kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL);
+	if (!td->tags)
+		goto error;
+	td->dtags = kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL);
+	if (!td->dtags)
+		goto error;
+	td->r_len = MTE_GRANULES_PER_PAGE;
+	td->r_sizes = kmalloc_array(MTE_GRANULES_PER_PAGE,
+				    sizeof(unsigned short), GFP_KERNEL);
+	if (!td->r_sizes)
+		goto error;
+	td->r_tags = kmalloc(MTE_GRANULES_PER_PAGE, GFP_KERNEL);
+	if (!td->r_tags)
+		goto error;
+	test->priv = (void *)td;
+	return 0;
+error:
+	mtecomp_dealloc_testdata(td);
+	return 1;
+}
+
+static void mtecomp_test_exit(struct kunit *test)
+{
+	struct test_data *td = test->priv;
+
+	mtecomp_dealloc_testdata(td);
+}
+
+static struct kunit_case mtecomp_test_cases[] = {
+	KUNIT_CASE(test_tags_to_ranges_zero),
+	KUNIT_CASE(test_tags_to_ranges_simple),
+	KUNIT_CASE(test_tags_to_ranges_repeated),
+	KUNIT_CASE(test_tag_to_ranges_n),
+	KUNIT_CASE(test_compress_zero),
+	KUNIT_CASE(test_compress_nonzero),
+	KUNIT_CASE(test_two_ranges),
+	KUNIT_CASE(test_compress_simple),
+	KUNIT_CASE(test_compress_ranges),
+	KUNIT_CASE(test_decompress_invalid),
+	KUNIT_CASE(test_upper_bits),
+	{}
+};
+
+static struct kunit_suite mtecomp_test_suite = {
+	.name = "mtecomp",
+	.init = mtecomp_test_init,
+	.exit = mtecomp_test_exit,
+	.test_cases = mtecomp_test_cases,
+};
+kunit_test_suites(&mtecomp_test_suite);
+
+MODULE_IMPORT_NS(MTECOMP);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Alexander Potapenko");
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sun Dec 28 00:48:26 2025
Date: Thu, 14 Dec 2023 12:06:38 +0100
In-Reply-To: <20231214110639.2294687-1-glider@google.com>
References: <20231214110639.2294687-1-glider@google.com>
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
Message-ID: <20231214110639.2294687-7-glider@google.com>
Subject: [PATCH v10-mte 6/7] arm64: mte: add compression support to mteswap.c
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com,
 andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com,
 alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

Update mteswap.c to perform inline compression of memory tags when possible.
If CONFIG_ARM64_MTE_COMP is enabled, mteswap.c will attempt to compress
saved tags for a struct page and store them directly in the Xarray entry
instead of wasting heap space.

Soon after booting Android, tag compression roughly halves the memory
previously spent by mteswap.c on tag allocations. On a moderately loaded
device with ~20% tagged pages, this saves several megabytes of kernel heap:

  # cat /sys/kernel/debug/mteswap/stats
  8 bytes: 102496 allocations, 67302 deallocations
  128 bytes: 212234 allocations, 178278 deallocations
  uncompressed tag storage size: 8851200
  compressed tag storage size: 4346368

(statistics collection is introduced in the following patch)

Signed-off-by: Alexander Potapenko
Reviewed-by: Catalin Marinas

---
 v10-mte:
  - added Catalin's Reviewed-by:

 v9:
  - as requested by Yury Norov, split off statistics collection into
    a separate patch
  - minor fixes

 v8:
  - adapt to the new compression API, abandon mteswap_{no,}comp.c
  - move stats collection to mteswap.c

 v5:
  - drop a dead variable from _mte_free_saved_tags() in mteswap_comp.c
  - ensure MTE compression works with arbitrary page sizes
  - update patch description

 v4:
  - minor code simplifications suggested by Andy Shevchenko, added
    missing header dependencies
  - changed compression API names to reflect modifications made to
    memcomp.h (as suggested by Yury Norov)

 v3:
  - Addressed comments by Andy Shevchenko in another patch:
   - fixed includes order
   - replaced u64 with unsigned long
   - added MODULE_IMPORT_NS(MTECOMP)
---
 arch/arm64/mm/mteswap.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index a31833e3ddc54..70f5c8ecd640d 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -6,6 +6,8 @@
 #include
 #include
 #include
+#include
+#include "mtecomp.h"
 
 static DEFINE_XARRAY(mte_pages);
 
@@ -17,12 +19,13 @@ void *mte_allocate_tag_storage(void)
 
 void mte_free_tag_storage(char *storage)
 {
-	kfree(storage);
+	if (!mte_is_compressed(storage))
+		kfree(storage);
 }
 
 int mte_save_tags(struct page *page)
 {
-	void *tag_storage, *ret;
+	void *tag_storage, *compressed_storage, *ret;
 
 	if (!page_mte_tagged(page))
 		return 0;
@@ -32,6 +35,11 @@ int mte_save_tags(struct page *page)
 		return -ENOMEM;
 
 	mte_save_page_tags(page_address(page), tag_storage);
+	compressed_storage = mte_compress(tag_storage);
+	if (compressed_storage) {
+		mte_free_tag_storage(tag_storage);
+		tag_storage = compressed_storage;
+	}
 
 	/* lookup the swap entry.val from the page */
 	ret = xa_store(&mte_pages, page_swap_entry(page).val, tag_storage,
@@ -50,13 +58,20 @@ int mte_save_tags(struct page *page)
 void mte_restore_tags(swp_entry_t entry, struct page *page)
 {
 	void *tags = xa_load(&mte_pages, entry.val);
+	void *tag_storage = NULL;
 
 	if (!tags)
 		return;
 
 	if (try_page_mte_tagging(page)) {
+		if (mte_is_compressed(tags)) {
+			tag_storage = mte_allocate_tag_storage();
+			mte_decompress(tags, tag_storage);
+			tags = tag_storage;
+		}
 		mte_restore_page_tags(page_address(page), tags);
 		set_page_mte_tagged(page);
+		mte_free_tag_storage(tag_storage);
 	}
 }
 
-- 
2.43.0.472.g3155946c3a-goog

From nobody Sun Dec 28 00:48:26 2025
Date: Thu, 14 Dec 2023 12:06:39 +0100
In-Reply-To: <20231214110639.2294687-1-glider@google.com>
References: <20231214110639.2294687-1-glider@google.com>
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
Message-ID: <20231214110639.2294687-8-glider@google.com>
Subject: [PATCH v10-mte 7/7] arm64: mte: implement CONFIG_ARM64_MTE_SWAP_STATS
From: Alexander Potapenko
To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com,
 andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com,
 aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com,
 alexandru.elisei@arm.com
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org

Provide a config to collect the usage statistics for ARM MTE tag
compression.

This patch introduces allocation/deallocation counters for buffers that
were stored uncompressed (and thus occupy 128 bytes of heap plus the
Xarray overhead to store a pointer) and those that were compressed into
8-byte pointers (effectively using 0 bytes of heap in addition to the
Xarray overhead).

The counters are exposed to the userspace via
/sys/kernel/debug/mteswap/stats:

  # cat /sys/kernel/debug/mteswap/stats
  8 bytes: 102496 allocations, 67302 deallocations
  128 bytes: 212234 allocations, 178278 deallocations
  uncompressed tag storage size: 8851200
  compressed tag storage size: 4346368

Suggested-by: Yury Norov
Signed-off-by: Alexander Potapenko
Acked-by: Catalin Marinas
Reviewed-by: Yury Norov

---
This patch was split off from the "arm64: mte: add compression support to
mteswap.c" patch
(https://lore.kernel.org/linux-arm-kernel/ZUVulBKVYK7cq2rJ@yury-ThinkPad/T/#m819ec30beb9de53d5c442f7e3247456f8966d88a)

v10-mte:
 - added Catalin's Acked-by:

v9:
 - add this patch, put the stats behind a separate config,
   mention /sys/kernel/debug/mteswap/stats in the documentation
---
 .../arch/arm64/mte-tag-compression.rst | 12 +++
 arch/arm64/Kconfig                     | 15 +++
 arch/arm64/mm/mteswap.c                | 93 ++++++++++++++++++-
 3 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/arm64/mte-tag-compression.rst b/Documentation/arch/arm64/mte-tag-compression.rst
index 8fe6b51a9db6d..4c25b96f7d4b5 100644
--- a/Documentation/arch/arm64/mte-tag-compression.rst
+++ b/Documentation/arch/arm64/mte-tag-compression.rst
@@ -145,6 +145,18 @@ Tag compression and decompression implicitly rely on the fixed MTE tag size
 (4 bits) and number of tags per page. Should these values change, the algorithm
 may need to be revised.
 
+Stats
+=====
+
+When `CONFIG_ARM64_MTE_SWAP_STATS` is enabled, `arch/arm64/mm/mteswap.c` exports
+usage statistics for tag compression used when swapping tagged pages. The data
+can be accessed via debugfs::
+
+  # cat /sys/kernel/debug/mteswap/stats
+  8 bytes: 10438 allocations, 10417 deallocations
+  128 bytes: 26180 allocations, 26179 deallocations
+  uncompressed tag storage size: 2816
+  compressed tag storage size: 128
 
 Programming Interface
 =====================
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6a1397a96f2f0..49a786c7edadd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2100,6 +2100,21 @@ config ARM64_MTE_COMP_KUNIT_TEST
 	  be compressed into pointer-size values and correctly decompressed
 	  afterwards.
 
+config ARM64_MTE_SWAP_STATS
+	bool "Collect usage statistics of tag compression for swapped MTE tags"
+	default y
+	depends on ARM64_MTE && ARM64_MTE_COMP
+	help
+	  Collect usage statistics for ARM64 MTE tag compression during swapping.
+
+	  Adds allocation/deallocation counters for buffers that were stored
+	  uncompressed (and thus occupy 128 bytes of heap plus the Xarray
+	  overhead to store a pointer) and those that were compressed into
+	  8-byte pointers (effectively using 0 bytes of heap in addition to
+	  the Xarray overhead).
+	  The counters are exposed to the userspace via
+	  /sys/kernel/debug/mteswap/stats.
+
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index 70f5c8ecd640d..1c6c78b9a9037 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
+#include
 #include
 #include
 #include
@@ -11,16 +12,54 @@
 
 static DEFINE_XARRAY(mte_pages);
 
+enum mteswap_counters {
+	MTESWAP_CTR_INLINE = 0,
+	MTESWAP_CTR_OUTLINE,
+	MTESWAP_CTR_SIZE
+};
+
+#if defined(CONFIG_ARM64_MTE_SWAP_STATS)
+static atomic_long_t alloc_counters[MTESWAP_CTR_SIZE];
+static atomic_long_t dealloc_counters[MTESWAP_CTR_SIZE];
+
+static void inc_alloc_counter(int kind)
+{
+	atomic_long_inc(&alloc_counters[kind]);
+}
+
+static void inc_dealloc_counter(int kind)
+{
+	atomic_long_inc(&dealloc_counters[kind]);
+}
+#else
+static void inc_alloc_counter(int kind)
+{
+}
+
+static void inc_dealloc_counter(int kind)
+{
+}
+#endif
+
 void *mte_allocate_tag_storage(void)
 {
+	void *ret;
+
 	/* tags granule is 16 bytes, 2 tags stored per byte */
-	return kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL);
+	ret = kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL);
+	if (ret)
+		inc_alloc_counter(MTESWAP_CTR_OUTLINE);
+	return ret;
 }
 
 void mte_free_tag_storage(char *storage)
 {
-	if (!mte_is_compressed(storage))
+	if (!mte_is_compressed(storage)) {
 		kfree(storage);
+		inc_dealloc_counter(MTESWAP_CTR_OUTLINE);
+	} else {
+		inc_dealloc_counter(MTESWAP_CTR_INLINE);
+	}
 }
 
 int mte_save_tags(struct page *page)
@@ -39,6 +78,7 @@ int mte_save_tags(struct page *page)
 	if (compressed_storage) {
 		mte_free_tag_storage(tag_storage);
 		tag_storage = compressed_storage;
+		inc_alloc_counter(MTESWAP_CTR_INLINE);
 	}
 
 	/* lookup the swap entry.val from the page */
@@ -98,3 +138,52 @@ void mte_invalidate_tags_area(int type)
 	}
 	xa_unlock(&mte_pages);
 }
+
+#if defined(CONFIG_ARM64_MTE_SWAP_STATS)
+/* DebugFS interface. */
+static int stats_show(struct seq_file *seq, void *v)
+{
+	unsigned long total_mem_alloc = 0, total_mem_dealloc = 0;
+	unsigned long total_num_alloc = 0, total_num_dealloc = 0;
+	unsigned long sizes[2] = { 8, MTE_PAGE_TAG_STORAGE };
+	long alloc, dealloc;
+	unsigned long size;
+	int i;
+
+	for (i = 0; i < MTESWAP_CTR_SIZE; i++) {
+		alloc = atomic_long_read(&alloc_counters[i]);
+		dealloc = atomic_long_read(&dealloc_counters[i]);
+		total_num_alloc += alloc;
+		total_num_dealloc += dealloc;
+		size = sizes[i];
+		/*
+		 * Do not count 8-byte buffers towards compressed tag storage
+		 * size.
+		 */
+		if (i) {
+			total_mem_alloc += (size * alloc);
+			total_mem_dealloc += (size * dealloc);
+		}
+		seq_printf(seq,
+			   "%lu bytes:\t%lu allocations,\t%lu deallocations\n",
+			   size, alloc, dealloc);
+	}
+	seq_printf(seq, "uncompressed tag storage size:\t%lu\n",
+		   (total_num_alloc - total_num_dealloc) *
+			   MTE_PAGE_TAG_STORAGE);
+	seq_printf(seq, "compressed tag storage size:\t%lu\n",
+		   total_mem_alloc - total_mem_dealloc);
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(stats);
+
+static int mteswap_init(void)
+{
+	struct dentry *mteswap_dir;
+
+	mteswap_dir = debugfs_create_dir("mteswap", NULL);
+	debugfs_create_file("stats", 0444, mteswap_dir, NULL, &stats_fops);
+	return 0;
+}
+module_init(mteswap_init);
+#endif
-- 
2.43.0.472.g3155946c3a-goog
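
For reference, the two summary lines printed by stats_show() can be reproduced from the four counters by hand. The sketch below is a user-space restatement of that arithmetic, not kernel code: struct swap_stats, uncompressed_size() and compressed_size() are hypothetical names introduced only for illustration, and TAG_STORAGE_BYTES assumes the 128-byte MTE_PAGE_TAG_STORAGE quoted in the commit message.

```c
/* Hypothetical user-space mirror of the counters kept by mteswap.c. */
enum { CTR_INLINE, CTR_OUTLINE, CTR_SIZE };

struct swap_stats {
	long alloc[CTR_SIZE];	/* allocations per storage kind */
	long dealloc[CTR_SIZE];	/* deallocations per storage kind */
};

/* Assumption: MTE_PAGE_TAG_STORAGE is 128 bytes, as in the commit message. */
#define TAG_STORAGE_BYTES 128UL

/*
 * Heap that would be in use if every live entry, inline or out-of-line,
 * occupied a full 128-byte buffer.
 */
unsigned long uncompressed_size(const struct swap_stats *s)
{
	long live = 0;
	int i;

	for (i = 0; i < CTR_SIZE; i++)
		live += s->alloc[i] - s->dealloc[i];
	return (unsigned long)live * TAG_STORAGE_BYTES;
}

/*
 * Heap actually in use: only out-of-line buffers count, because inline
 * entries live in the Xarray slot itself.
 */
unsigned long compressed_size(const struct swap_stats *s)
{
	long live = s->alloc[CTR_OUTLINE] - s->dealloc[CTR_OUTLINE];

	return (unsigned long)live * TAG_STORAGE_BYTES;
}
```

Feeding in the counters from the commit message (102496/67302 for 8-byte entries, 212234/178278 for 128-byte buffers) gives 8851200 and 4346368, which matches the sample output above.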