From nobody Sat May 30 12:36:06 2026
From: Wenchao Hao
To: Albert Ou, Alexandre Ghiti, Andrew Morton, Barry Song <21cnbao@gmail.com>,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org, Minchan Kim, Palmer Dabbelt, Paul Walmsley,
 Sergey Senozhatsky
Cc: Wenchao Hao, Wenchao Hao
Subject: [RFC PATCH 1/3] mm/zsmalloc: encode class index in obj value for lockless class lookup
Date: Fri, 8 May 2026 14:19:08 +0800
Message-Id: <20260508061910.3882831-2-haowenchao@xiaomi.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260508061910.3882831-1-haowenchao@xiaomi.com>
References: <20260508061910.3882831-1-haowenchao@xiaomi.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Encode the size class index (class_idx) into the obj value so that
zs_free() can determine the correct size_class without dereferencing
the handle->obj->PFN->zpdesc->zspage->class chain under pool->lock.

OBJ_INDEX_BITS is over-provisioned on 64-bit systems. For example on
arm64 with default chain_size=8: OBJ_INDEX_BITS=24 but only 10 bits are
actually needed for obj_idx. We dynamically compute OBJ_CLASS_BITS as
ilog2(ZS_SIZE_CLASSES - 1) + 1 (8 bits for 4K pages, 9 for 64K) and
verify at compile time via static_assert that the three fields
(PFN + class_idx + obj_idx) fit within BITS_PER_LONG.
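
To make the bit budget above concrete, here is a small userspace sketch
(not part of the patch). It assumes the arm64 numbers quoted in the text
(OBJ_INDEX_BITS=24, hence a 40-bit PFN field, and 8 class bits for 4K
pages). All EX_* and ex_* names are hypothetical stand-ins for the kernel
macros and helpers, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed example widths (arm64, 4K pages, chain_size=8); not kernel macros. */
#define EX_BITS_PER_LONG 64u
#define EX_INDEX_BITS    24u					/* OBJ_INDEX_BITS in the text */
#define EX_PFN_BITS      (EX_BITS_PER_LONG - EX_INDEX_BITS)	/* 40 */
#define EX_CLASS_BITS    8u					/* 4K-page case */
#define EX_IDX_BITS      (EX_INDEX_BITS - EX_CLASS_BITS)	/* 16; only 10 are needed */
#define EX_CLASS_MASK    ((UINT64_C(1) << EX_CLASS_BITS) - 1)
#define EX_IDX_MASK      ((UINT64_C(1) << EX_IDX_BITS) - 1)

/* Same shape as the compile-time check described in the commit message. */
static_assert(EX_PFN_BITS + EX_CLASS_BITS + 10u <= EX_BITS_PER_LONG,
	      "PFN + class_idx + obj_idx must fit in one 64-bit word");

/* Bits needed to represent 0..count-1, i.e. ilog2(count - 1) + 1 for count >= 2. */
static unsigned int ex_bits_needed(uint64_t count)
{
	unsigned int bits = 0;

	for (uint64_t max = count - 1; max != 0; max >>= 1)
		bits++;
	return bits;
}

/* Pack [PFN | class_idx | obj_idx] into one word. */
static uint64_t ex_encode(uint64_t pfn, unsigned int class_idx, unsigned int obj_idx)
{
	uint64_t obj = pfn << EX_INDEX_BITS;

	obj |= ((uint64_t)class_idx & EX_CLASS_MASK) << EX_IDX_BITS;
	obj |= (uint64_t)obj_idx & EX_IDX_MASK;
	return obj;
}

static uint64_t ex_pfn(uint64_t obj)
{
	return obj >> EX_INDEX_BITS;
}

static unsigned int ex_class_idx(uint64_t obj)
{
	return (unsigned int)((obj >> EX_IDX_BITS) & EX_CLASS_MASK);
}

static unsigned int ex_obj_idx(uint64_t obj)
{
	return (unsigned int)(obj & EX_IDX_MASK);
}
```

With these widths, ex_bits_needed(8 * 4096 / 32) gives the 10 bits
mentioned for obj_idx, and re-encoding an obj with a different PFN leaves
ex_class_idx() unchanged, which is the property the lockless lookup
relies on.
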

This encoding is gated by ZS_OBJ_CLASS_IDX, defined only when
BITS_PER_LONG >= 64. On 32-bit systems the bits do not fit, so the
feature is disabled and the original OBJ_INDEX layout is preserved.

Split OBJ_INDEX into class_idx and obj_idx:

  obj: [PFN | class_idx | obj_idx]
       [_PFN_BITS | OBJ_CLASS_BITS | OBJ_IDX_BITS]

class_idx is invariant across page migration (only PFN changes), so a
lockless read always yields a valid class_idx.

Update obj_to_location(), location_to_obj() and callers accordingly.
Add obj_to_class_idx() helper. Adjust ZS_MIN_ALLOC_SIZE to use
OBJ_IDX_BITS.

Signed-off-by: Wenchao Hao
---
 mm/zsmalloc.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 88 insertions(+), 7 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 63128ddb7959..bccadf0a27f2 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -96,11 +96,74 @@
 #define CLASS_BITS 8
 #define MAGIC_VAL_BITS 8
 
+/*
+ * Optionally encode the size class index in the obj value so that
+ * zs_free() can look up the correct class without holding pool->lock.
+ *
+ * Rather than fixing a hard CLASS_BITS constant for the class_idx field,
+ * we compute the minimum bits needed from the actual number of size classes
+ * and the actual maximum obj_idx, then check whether they all fit:
+ *
+ *   _PFN_BITS + OBJ_CLASS_BITS_NEEDED + OBJ_IDX_BITS_NEEDED <= BITS_PER_LONG
+ *
+ * This naturally handles all architectures and PAGE_SIZE configurations:
+ *
+ *  - 32-bit: BITS_PER_LONG=32, sum easily exceeds 32 --> disabled.
+ *  - powerpc64 64K pages: ZS_SIZE_CLASSES=257 --> OBJ_CLASS_BITS_NEEDED=9,
+ *    but the sum still fits in 64 bits --> enabled.
+ *  - riscv64 Sv57: _PFN_BITS=44, tight but still fits --> enabled.
+ *
+ * When enabled, obj layout is:
+ *
+ *   63                                       0
+ *   +-----------+--------------+-------------+
+ *   |    PFN    |  class_idx   |   obj_idx   |
+ *   | _PFN_BITS |OBJ_CLASS_BITS| OBJ_IDX_BITS|
+ *   +-----------+--------------+-------------+
+ *
+ * Migration only rewrites PFN; class_idx and obj_idx are invariant,
+ * so a lockless read of obj always yields a valid class_idx.
+ */
+
+#if BITS_PER_LONG >= 64
+#define ZS_OBJ_CLASS_IDX
+#endif
+
+#ifdef ZS_OBJ_CLASS_IDX
+
+/* ZS_SIZE_CLASSES computed conservatively with original OBJ_INDEX_BITS */
+#define ZS_MIN_ALLOC_SIZE_FULL \
+	MAX(32, (CONFIG_ZSMALLOC_CHAIN_SIZE << PAGE_SHIFT >> OBJ_INDEX_BITS))
+#define ZS_SIZE_CLASSES_FULL \
+	(DIV_ROUND_UP(PAGE_SIZE - ZS_MIN_ALLOC_SIZE_FULL, \
+		      PAGE_SIZE >> CLASS_BITS) + 1)
+
+#define ZS_MAX_OBJ_COUNT_FULL \
+	(CONFIG_ZSMALLOC_CHAIN_SIZE * PAGE_SIZE / 32)
+#define OBJ_CLASS_BITS_NEEDED (ilog2(ZS_SIZE_CLASSES_FULL - 1) + 1)
+#define OBJ_IDX_BITS_NEEDED (ilog2(ZS_MAX_OBJ_COUNT_FULL - 1) + 1)
+
+static_assert(_PFN_BITS + OBJ_CLASS_BITS_NEEDED + OBJ_IDX_BITS_NEEDED
+	      <= BITS_PER_LONG,
+	      "zsmalloc: class_idx + obj_idx + PFN do not fit in obj on this config");
+
+#define OBJ_CLASS_BITS OBJ_CLASS_BITS_NEEDED
+#define OBJ_IDX_BITS (OBJ_INDEX_BITS - OBJ_CLASS_BITS)
+#define OBJ_IDX_MASK ((_AC(1, UL) << OBJ_IDX_BITS) - 1)
+#define OBJ_CLASS_MASK ((_AC(1, UL) << OBJ_CLASS_BITS) - 1)
+
+#else /* !ZS_OBJ_CLASS_IDX */
+
+#define OBJ_IDX_BITS OBJ_INDEX_BITS
+#define OBJ_IDX_MASK OBJ_INDEX_MASK
+
+#endif /* ZS_OBJ_CLASS_IDX */
+
 #define ZS_MAX_PAGES_PER_ZSPAGE (_AC(CONFIG_ZSMALLOC_CHAIN_SIZE, UL))
 
 /* ZS_MIN_ALLOC_SIZE must be multiple of ZS_ALIGN */
 #define ZS_MIN_ALLOC_SIZE \
-	MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
+	MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_IDX_BITS))
 /* each chunk includes extra space to keep handle */
 #define ZS_MAX_ALLOC_SIZE PAGE_SIZE
 
@@ -722,7 +785,7 @@ static void obj_to_location(unsigned long obj, struct zpdesc **zpdesc,
 				unsigned int *obj_idx)
 {
 	*zpdesc = pfn_zpdesc(obj >> OBJ_INDEX_BITS);
-	*obj_idx = (obj & OBJ_INDEX_MASK);
+	*obj_idx = (obj & OBJ_IDX_MASK);
 }
 
 static void obj_to_zpdesc(unsigned long obj, struct zpdesc **zpdesc)
@@ -730,17 +793,29 @@ static void obj_to_zpdesc(unsigned long obj, struct zpdesc **zpdesc)
 	*zpdesc = pfn_zpdesc(obj >> OBJ_INDEX_BITS);
 }
 
+#ifdef ZS_OBJ_CLASS_IDX
+static unsigned int obj_to_class_idx(unsigned long obj)
+{
+	return (obj >> OBJ_IDX_BITS) & OBJ_CLASS_MASK;
+}
+#endif
+
 /**
- * location_to_obj - get obj value encoded from (, )
+ * location_to_obj - encode (, , ) into obj value
  * @zpdesc: zpdesc object resides in zspage
  * @obj_idx: object index
+ * @class_idx: size class index (used only when ZS_OBJ_CLASS_IDX is defined)
  */
-static unsigned long location_to_obj(struct zpdesc *zpdesc, unsigned int obj_idx)
+static unsigned long location_to_obj(struct zpdesc *zpdesc, unsigned int obj_idx,
+				     unsigned int class_idx)
 {
 	unsigned long obj;
 
 	obj = zpdesc_pfn(zpdesc) << OBJ_INDEX_BITS;
-	obj |= obj_idx & OBJ_INDEX_MASK;
+#ifdef ZS_OBJ_CLASS_IDX
+	obj |= (unsigned long)(class_idx & OBJ_CLASS_MASK) << OBJ_IDX_BITS;
+#endif
+	obj |= obj_idx & OBJ_IDX_MASK;
 
 	return obj;
 }
@@ -1276,7 +1351,7 @@ static unsigned long obj_malloc(struct zs_pool *pool,
 	kunmap_local(vaddr);
 	mod_zspage_inuse(zspage, 1);
 
-	obj = location_to_obj(m_zpdesc, obj);
+	obj = location_to_obj(m_zpdesc, obj, zspage->class);
 	record_obj(handle, obj);
 
 	return obj;
@@ -1762,7 +1837,13 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 
 			old_obj = handle_to_obj(handle);
 			obj_to_location(old_obj, &dummy, &obj_idx);
-			new_obj = (unsigned long)location_to_obj(newzpdesc, obj_idx);
+#ifdef ZS_OBJ_CLASS_IDX
+			new_obj = (unsigned long)location_to_obj(newzpdesc,
+					obj_idx, obj_to_class_idx(old_obj));
+#else
+			new_obj = (unsigned long)location_to_obj(newzpdesc,
+					obj_idx, 0);
+#endif
 			record_obj(handle, new_obj);
 		}
 	}
-- 
2.34.1

From nobody Sat May 30 12:36:06 2026
From: Wenchao Hao
To: Albert Ou, Alexandre Ghiti, Andrew Morton, Barry Song <21cnbao@gmail.com>,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org, Minchan Kim, Palmer Dabbelt, Paul Walmsley,
 Sergey Senozhatsky
Cc: Wenchao Hao, Wenchao Hao
Subject: [RFC PATCH 2/3] mm/zsmalloc: remove pool->lock from zs_free on 64-bit systems
Date: Fri, 8 May 2026 14:19:09 +0800
Message-Id: <20260508061910.3882831-3-haowenchao@xiaomi.com>
In-Reply-To: <20260508061910.3882831-1-haowenchao@xiaomi.com>
References: <20260508061910.3882831-1-haowenchao@xiaomi.com>

With class_idx now encoded in the obj value (ZS_OBJ_CLASS_IDX), zs_free()
no longer needs pool->lock to locate the size class on 64-bit systems.

The class_idx is invariant across page migration (only PFN changes), and
64-bit aligned reads are atomic, so a lockless read of the handle always
yields a valid class_idx. After acquiring class->lock (which blocks
concurrent migration), the handle is re-read to obtain a stable PFN for
the actual free operation.

This eliminates rwlock read-side contention between zs_free() and page
migration/compaction, improving zs_free() scalability on multi-core
systems.
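
The read, lock, re-read ordering described above can be sketched in
userspace roughly as follows (not part of the patch). This is a
single-threaded walk-through of the ordering only; it does not
demonstrate or prove concurrency safety. The atomic_flag spinlock is a
stand-in for class->lock, the field widths are assumed example values,
and every ex_* / EX_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed example layout: [PFN | 8-bit class_idx | 16-bit obj_idx]. */
#define EX_IDX_BITS   16u
#define EX_CLASS_BITS 8u
#define EX_INDEX_BITS (EX_IDX_BITS + EX_CLASS_BITS)
#define EX_CLASS_MASK ((UINT64_C(1) << EX_CLASS_BITS) - 1)
#define EX_LOW_MASK   ((UINT64_C(1) << EX_INDEX_BITS) - 1)
#define EX_NR_CLASSES (1u << EX_CLASS_BITS)

struct ex_class {
	atomic_flag lock;	/* stand-in for class->lock */
	long objs_inuse;
};

struct ex_pool {
	struct ex_class classes[EX_NR_CLASSES];
};

static void ex_lock(struct ex_class *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void ex_unlock(struct ex_class *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Class lookup needs only the class_idx field, never the PFN. */
static struct ex_class *ex_class_of(struct ex_pool *pool, uint64_t obj)
{
	return &pool->classes[(obj >> EX_IDX_BITS) & EX_CLASS_MASK];
}

/* Migration stand-in: rewrites only the PFN field, under the class lock. */
static void ex_migrate(struct ex_pool *pool, _Atomic uint64_t *handle, uint64_t new_pfn)
{
	struct ex_class *c = ex_class_of(pool, atomic_load(handle));
	uint64_t obj;

	ex_lock(c);
	obj = atomic_load(handle);
	atomic_store(handle, (new_pfn << EX_INDEX_BITS) | (obj & EX_LOW_MASK));
	ex_unlock(c);
}

/* Free stand-in: returns the PFN observed under the class lock. */
static uint64_t ex_free(struct ex_pool *pool, _Atomic uint64_t *handle)
{
	/* Lockless read: the PFN may be stale, but class_idx never changes. */
	struct ex_class *c = ex_class_of(pool, atomic_load(handle));
	uint64_t obj;

	ex_lock(c);
	obj = atomic_load(handle);	/* re-read: PFN is stable from here on */
	c->objs_inuse--;
	ex_unlock(c);
	return obj >> EX_INDEX_BITS;
}

/* Allocate in class 7 at PFN 0x111, migrate to PFN 0x222, then free. */
static uint64_t ex_demo(void)
{
	static struct ex_pool pool;
	_Atomic uint64_t handle;
	unsigned int i;

	for (i = 0; i < EX_NR_CLASSES; i++)
		atomic_flag_clear(&pool.classes[i].lock);

	atomic_init(&handle, (UINT64_C(0x111) << EX_INDEX_BITS) |
			     (UINT64_C(7) << EX_IDX_BITS) | 3);
	pool.classes[7].objs_inuse = 1;

	ex_migrate(&pool, &handle, 0x222);
	return ex_free(&pool, &handle);
}
```

In this sketch ex_free() picks the class from the first read and trusts
the PFN only from the second read, so a migration that completes between
the two reads is still handled: ex_demo() frees against the new PFN.
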

On 32-bit systems (ZS_OBJ_CLASS_IDX not defined), the original pool->lock
path is preserved.

Signed-off-by: Wenchao Hao
---
 mm/zsmalloc.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index bccadf0a27f2..47ec0414ce9e 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -21,6 +21,10 @@
  *	pool->lock
  *	class->lock
  *	zspage->lock
+ *
+ * On 64-bit systems with ZS_OBJ_CLASS_IDX enabled, zs_free() does not
+ * take pool->lock; it extracts class_idx from the obj encoding with a
+ * lockless read, then re-reads obj under class->lock.
  */
 
 #include
@@ -1467,10 +1471,24 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	if (IS_ERR_OR_NULL((void *)handle))
 		return;
 
+#ifdef ZS_OBJ_CLASS_IDX
+	/*
+	 * The class_idx encoded in obj is invariant across migration
+	 * (only PFN changes), and the read of *(unsigned long *)handle
+	 * is atomic on 64-bit, so we can determine the correct class
+	 * without holding pool->lock.
+	 */
+	obj = handle_to_obj(handle);
+	class = pool->size_class[obj_to_class_idx(obj)];
+	spin_lock(&class->lock);
 	/*
-	 * The pool->lock protects the race with zpage's migration
-	 * so it's safe to get the page from handle.
+	 * Re-read under class->lock: migration also acquires class->lock,
+	 * so the obj value is now stable and the PFN is valid.
 	 */
+	obj = handle_to_obj(handle);
+	obj_to_zpdesc(obj, &f_zpdesc);
+	zspage = get_zspage(f_zpdesc);
+#else
 	read_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_zpdesc(obj, &f_zpdesc);
@@ -1478,6 +1496,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	class = zspage_class(pool, zspage);
 	spin_lock(&class->lock);
 	read_unlock(&pool->lock);
+#endif
 
 	class_stat_sub(class, ZS_OBJS_INUSE, 1);
 	obj_free(class->size, obj);
-- 
2.34.1

From nobody Sat May 30 12:36:06 2026
From: Wenchao Hao
To: Albert Ou, Alexandre Ghiti, Andrew Morton, Barry Song <21cnbao@gmail.com>,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org, Minchan Kim, Palmer Dabbelt, Paul Walmsley,
 Sergey Senozhatsky
Cc: Wenchao Hao, Xueyuan Chen, Wenchao Hao
Subject: [RFC PATCH 3/3] mm/zsmalloc: drop class lock before freeing zspage
Date: Fri, 8 May 2026 14:19:10 +0800
Message-Id: <20260508061910.3882831-4-haowenchao@xiaomi.com>
In-Reply-To: <20260508061910.3882831-1-haowenchao@xiaomi.com>
References: <20260508061910.3882831-1-haowenchao@xiaomi.com>

From: Xueyuan Chen

Currently in zs_free(), the class->lock is held until the zspage is
completely freed and the counters are updated. However, freeing pages back
to the buddy allocator requires acquiring the zone lock.
Under heavy memory pressure, zone lock contention can be severe. When this
happens, the CPU holding the class->lock will stall waiting for the zone
lock, thereby blocking all other CPUs attempting to acquire the same
class->lock.

This patch shrinks the critical section of the class->lock to reduce lock
contention. By moving the actual page freeing process outside the
class->lock, we can improve the concurrency performance of zs_free().

Testing on the RADXA O6 platform shows that with 12 CPUs concurrently
performing zs_free() operations, the execution time is reduced by 20%.

Signed-off-by: Xueyuan Chen
Signed-off-by: Wenchao Hao
---
 mm/zsmalloc.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 47ec0414ce9e..4b01fb215b19 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -880,13 +880,10 @@ static int trylock_zspage(struct zspage *zspage)
 	return 0;
 }
 
-static void __free_zspage(struct zs_pool *pool, struct size_class *class,
-				struct zspage *zspage)
+static inline void __free_zspage_lockless(struct zs_pool *pool, struct zspage *zspage)
 {
 	struct zpdesc *zpdesc, *next;
 
-	assert_spin_locked(&class->lock);
-
 	VM_BUG_ON(get_zspage_inuse(zspage));
 	VM_BUG_ON(zspage->fullness != ZS_INUSE_RATIO_0);
 
@@ -902,7 +899,13 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 	} while (zpdesc != NULL);
 
 	cache_free_zspage(zspage);
+}
 
+static void __free_zspage(struct zs_pool *pool, struct size_class *class,
+			  struct zspage *zspage)
+{
+	assert_spin_locked(&class->lock);
+	__free_zspage_lockless(pool, zspage);
 	class_stat_sub(class, ZS_OBJS_ALLOCATED, class->objs_per_zspage);
 	atomic_long_sub(class->pages_per_zspage, &pool->pages_allocated);
 }
@@ -1467,6 +1470,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	unsigned long obj;
 	struct size_class *class;
 	int fullness;
+	struct zspage *zspage_to_free = NULL;
 
 	if (IS_ERR_OR_NULL((void *)handle))
 		return;
@@ -1502,10 +1506,22 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 	obj_free(class->size, obj);
 
 	fullness = fix_fullness_group(class, zspage);
-	if (fullness == ZS_INUSE_RATIO_0)
-		free_zspage(pool, class, zspage);
+	if (fullness == ZS_INUSE_RATIO_0) {
+		if (trylock_zspage(zspage)) {
+			remove_zspage(class, zspage);
+			class_stat_sub(class, ZS_OBJS_ALLOCATED,
+				       class->objs_per_zspage);
+			zspage_to_free = zspage;
+		} else
+			kick_deferred_free(pool);
+	}
 
 	spin_unlock(&class->lock);
+
+	if (likely(zspage_to_free)) {
+		__free_zspage_lockless(pool, zspage_to_free);
+		atomic_long_sub(class->pages_per_zspage, &pool->pages_allocated);
+	}
 	cache_free_handle(handle);
 }
 EXPORT_SYMBOL_GPL(zs_free);
-- 
2.34.1
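
The critical-section shrinking in patch 3/3 (detach and account under the
lock, release the memory after dropping it) can be illustrated with a
rough userspace sketch (not part of the series). Here free() stands in
for returning pages to the buddy allocator and an atomic_flag spinlock
stands in for class->lock. The trylock_zspage() / kick_deferred_free()
fallback from the patch is deliberately left out, and all ex_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct ex_zspage {
	int inuse;	/* live objects in this zspage */
	void *pages;	/* stand-in for the backing pages */
};

struct ex_class {
	atomic_flag lock;	/* stand-in for class->lock */
	long objs_allocated;
	int objs_per_zspage;
};

static void ex_lock(struct ex_class *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void ex_unlock(struct ex_class *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Frees one object; returns 1 if this call released the zspage, else 0. */
static int ex_free_obj(struct ex_class *c, struct ex_zspage *zs)
{
	struct ex_zspage *to_free = NULL;

	ex_lock(c);
	if (--zs->inuse == 0) {
		/* Only the cheap bookkeeping happens under the lock. */
		c->objs_allocated -= c->objs_per_zspage;
		to_free = zs;
	}
	ex_unlock(c);

	if (!to_free)
		return 0;

	/* The potentially slow release runs after the lock is dropped. */
	free(to_free->pages);
	free(to_free);
	return 1;
}

/* One zspage holding two objects: only the second free releases it. */
static int ex_demo(void)
{
	struct ex_class c = { .objs_allocated = 2, .objs_per_zspage = 2 };
	struct ex_zspage *zs = malloc(sizeof(*zs));
	int first, second;

	if (!zs)
		return -1;
	atomic_flag_clear(&c.lock);
	zs->inuse = 2;
	zs->pages = malloc(4096);

	first = ex_free_obj(&c, zs);
	second = ex_free_obj(&c, zs);
	return first == 0 && second == 1 && c.objs_allocated == 0;
}
```

Other threads waiting on the lock in this sketch are held up only for the
counter update, not for the release itself, which is the effect the
commit message attributes to moving the page freeing outside
class->lock.
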