From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 31D133002BF for ; Wed, 3 Sep 2025 13:00:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904414; cv=none; b=eNWk9awm2GyPfZLA/20nz+bHCq+fLq4CtXsXkRehfcJL/MMkCOSchQaMX6ugOum2lyVOsUiUsATAd7+5vstVnAFzCyae4CSug1B2QseNYEMQWEnIRi5mfw2la0eJyP/bElWXEatCGJnkaXNKrDE89FHm5mklvgr9mWg2T/HHE6o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904414; c=relaxed/simple; bh=Cnm5+lV8JYbFrvis7+1UCE3xAhZydUfyraB/zSOvzNU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=d9macgt/fPeTILVayv3HI5+9JQlTTsVJLvp/ow0XRz2phaCUsu2wDZ2LXTEPG7jhwxqSMeDc7wBM1X5QRqtOLP9sr+xq8mo81x8xJ+Jh6tz39AkcOI07iqrnchtKf0dNwIV8vfLEzbyew6HbNH08cqpTCWXK46gen0RHoVCaSTA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=arrlLRSN; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Dt+VVdAf; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=EFlow2CE; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=oiNOm3uq; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="arrlLRSN"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Dt+VVdAf"; 
dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="EFlow2CE"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="oiNOm3uq" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 3C4771F460; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=04+J/P3cAQAe3/G2iQS1u1ZbTuFSibWkDWn/mbh0R6o=; b=arrlLRSNCRL6ztrX4z+FD1hSkyyR9ZqSZNGG9Ed+HwGctCLDYHwoF/xJXBgSnpXh8ZYbHa ImQWKKzYk+O3SFPxoA44W6RoOKd22MkH39Wa/SNLF1GLnCqBN3HKYxgVz+wQ3849+Kivo6 QDOrNC8qkUlCSZbqJeGL9B+JCJHw2ZA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=04+J/P3cAQAe3/G2iQS1u1ZbTuFSibWkDWn/mbh0R6o=; b=Dt+VVdAf1kIxpkQBwHp5ohN73Uh9pPEAROl9LO8dyW7F8E9ilLpImzRQORnm9urOS2kZqA zEW7Pi2l0WmCC0AA== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=EFlow2CE; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=oiNOm3uq DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=04+J/P3cAQAe3/G2iQS1u1ZbTuFSibWkDWn/mbh0R6o=; b=EFlow2CECEAnA7D8X1e5og3e94SIX/I73jq5GNahYqNxIwOTEctVEt8z4kzvQn0rNE7ywJ Nx8GvUXnF9YbrZ771Q62/SM6ZLBb7jW6aZNiaqvuYh5ylYB6PFUSU+HLNjTNWQc5n2Ho6+ yESXtGfE/J7n1Z4NbG0RqzkObpqorXA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=04+J/P3cAQAe3/G2iQS1u1ZbTuFSibWkDWn/mbh0R6o=; b=oiNOm3uqjFRngvdN1r9enuok7DnUulcXMnJLb7Th9LxevZPfHOm3qhKwdrMH2qgHABx0pB qf+yX15kU2XL36CQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 2097613A94; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id CJWwB9k7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:43 +0200 Subject: [PATCH v7 01/21] locking/local_lock: Expose dep_map in local_trylock_t. Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-1-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, Alexei Starovoitov , Sebastian Andrzej Siewior X-Mailer: b4 0.14.2 X-Spam-Level: X-Spam-Flag: NO X-Rspamd-Queue-Id: 3C4771F460 X-Rspamd-Action: no action X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FUZZY_RATELIMITED(0.00)[rspamd.com]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[15]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,kernel.org,linutronix.de]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; DBL_BLOCKED_OPENRESOLVER(0.00)[linutronix.de:email,suse.cz:dkim,suse.cz:mid,suse.cz:email,imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo] X-Spam-Score: -4.51 From: Alexei Starovoitov lockdep_is_held() macro assumes that "struct lockdep_map dep_map;" is a top level field of any lock that participates in LOCKDEP. Make it so for local_trylock_t. 
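The constraint described above can be illustrated with a small userspace sketch. Everything in it is a hypothetical stand-in (the model_* names, the simplified struct lockdep_map, and the model_is_held() macro are assumptions for illustration, not the kernel definitions). It only shows why a macro that dereferences (lock)->dep_map needs dep_map as a top-level field, and why the new layout can still be viewed through a pointer to the plain lock type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct lockdep_map. */
struct lockdep_map {
	const char *name;
	int held;
};

/* Model of a macro that assumes a top-level dep_map field in the lock. */
#define model_is_held(lock) ((lock)->dep_map.held)

/* Model of local_lock_t with lock debugging enabled. */
struct model_local_lock {
	struct lockdep_map dep_map;
	void *owner;
};

/*
 * Old layout: dep_map is nested inside llock, so
 * model_is_held(&old_lock) would fail to compile.
 */
struct model_trylock_old {
	struct model_local_lock llock;
	unsigned char acquired;
};

/* New layout: dep_map and owner are top-level fields. */
struct model_trylock_new {
	struct lockdep_map dep_map;
	void *owner;
	unsigned char acquired;
};

/* The leading fields must line up for a cast to the plain lock type. */
_Static_assert(offsetof(struct model_trylock_new, dep_map) ==
	       offsetof(struct model_local_lock, dep_map),
	       "dep_map offsets differ");
_Static_assert(offsetof(struct model_trylock_new, owner) ==
	       offsetof(struct model_local_lock, owner),
	       "owner offsets differ");

/* Returns the held flag as seen through the macro on the new layout. */
static int new_layout_is_held(void)
{
	struct model_trylock_new l = {
		.dep_map = { "demo", 1 },
		.owner = NULL,
		.acquired = 1,
	};

	return model_is_held(&l);
}

/* Returns the held flag as seen through a pointer to the plain lock type. */
static int new_layout_is_held_via_cast(void)
{
	struct model_trylock_new l = {
		.dep_map = { "demo", 1 },
		.owner = NULL,
		.acquired = 1,
	};
	struct model_local_lock *plain = (struct model_local_lock *)&l;

	return plain->dep_map.held;
}
```

In this model the cast works only because both structs start with the same two fields, which is what the static assertions check.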
Reviewed-by: Sebastian Andrzej Siewior Signed-off-by: Alexei Starovoitov Signed-off-by: Vlastimil Babka Reviewed-by: Harry Yoo --- include/linux/local_lock_internal.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock= _internal.h index d80b5306a2c0ccf95a3405b6b947b5f1f9a3bd38..949de37700dbc10feafc06d0b52= 382cf2e00c694 100644 --- a/include/linux/local_lock_internal.h +++ b/include/linux/local_lock_internal.h @@ -17,7 +17,10 @@ typedef struct { =20 /* local_trylock() and local_trylock_irqsave() only work with local_tryloc= k_t */ typedef struct { - local_lock_t llock; +#ifdef CONFIG_DEBUG_LOCK_ALLOC + struct lockdep_map dep_map; + struct task_struct *owner; +#endif u8 acquired; } local_trylock_t; =20 @@ -31,7 +34,7 @@ typedef struct { .owner =3D NULL, =20 # define LOCAL_TRYLOCK_DEBUG_INIT(lockname) \ - .llock =3D { LOCAL_LOCK_DEBUG_INIT((lockname).llock) }, + LOCAL_LOCK_DEBUG_INIT(lockname) =20 static inline void local_lock_acquire(local_lock_t *l) { @@ -81,7 +84,7 @@ do { \ local_lock_debug_init(lock); \ } while (0) =20 -#define __local_trylock_init(lock) __local_lock_init(lock.llock) +#define __local_trylock_init(lock) __local_lock_init((local_lock_t *)lock) =20 #define __spinlock_nested_bh_init(lock) \ do { \ --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D50CE2FDC3D for ; Wed, 3 Sep 2025 13:00:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904413; cv=none; 
b=av3dF+qQjyThC1Ty9nh+m3hvluWwZUaoJDC2ul8N1yyII8G6tovvc7KKZB1ppLOlO0HjyPfqp7rOP121ThFcLudIa5GzgFJo6w+dW1LoJJs0URTJYQkFF5PyJ9kCAP7w5ARD5ngI//StbPtI3cmTH4qe+DPaMMlmRKHCCciiwok= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904413; c=relaxed/simple; bh=DO+/b1fm8eiQbtTkvGgyjEpChZcHLnwFFBNSKp6Po4I=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ICR8YHp43J8+piETf0HMe+apfMAbj3+rOx/dObzIw1QwWD+3kitxFhCtmHZz1qoGN+BBreMluFQ1StZgT6ItJyZaMOrJtH2kbUis4l4JIFJoSGLMMHaejDQBxgpe813QuYqhRTVrJ3ZsxY62rYpZ4RIcM+QMrEWPiHiNHvnqGng= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=BLRR/Z63; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=yQsqdA6a; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=TIZH5ZxK; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=mRKI7iRi; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="BLRR/Z63"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="yQsqdA6a"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="TIZH5ZxK"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="mRKI7iRi" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 519DF21222; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lCAb571qgZEf5w34vVc+FdxmgCSPpBunGHGMPiUvzjc=; b=BLRR/Z63uRSPDpSLnY/EA3qPOqyXZ1GSqrRQxV073ZyBznDRSTLUQu5P213sONgFw7kvzB bziFim3Yve/PXTTC5Jf/NprIdKOhJEs3V1TAfr3BNxMOOO4nXir4UNs6Wx6PokccHiiSu0 ZYjrz+GljxForqFGTUsnlUARS9kt4no= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lCAb571qgZEf5w34vVc+FdxmgCSPpBunGHGMPiUvzjc=; b=yQsqdA6aMMO7KGp1oGRL6HzLAVpQ9SHkNbckgJ11zmsSLSg+57PIfwmHZPC+qIqlFuK0O5 3qSZ0u2XyJz20oDg== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lCAb571qgZEf5w34vVc+FdxmgCSPpBunGHGMPiUvzjc=; b=TIZH5ZxK46TTdIt0YXWhyJN2yeTe3ZgTyXXWHsSCinGcLET4+HfQX8AIb5F8ozjKLQjaTm UtdUjg+FU/KCQeSsv/W5rVXnTC9QNBnkUzQKFjLXTCs/vrJZQ29y5LiW8ZJoVFtifwiYTJ 4lN5DyblL4PhQC7n3MmU0D/jGS1+qsw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lCAb571qgZEf5w34vVc+FdxmgCSPpBunGHGMPiUvzjc=; 
b=mRKI7iRiIklOEZ/OJsa93w7/i1JjjJf0aPegWeyufpMB30i4CuN3mMTTR4poXDjnkwKGpV SN2jX912YUOEOsCg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 361EB13ACB; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id gMTvDNk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:44 +0200 Subject: [PATCH v7 02/21] slab: simplify init_kmem_cache_nodes() error handling Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-2-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; ARC_NA(0.00)[]; FUZZY_RATELIMITED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; TO_DN_SOME(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,suse.cz:mid,suse.cz:email] X-Spam-Flag: NO X-Spam-Level: X-Spam-Score: -4.30 We don't need to call free_kmem_cache_nodes() immediately when failing to allocate a kmem_cache_node, because when we return 0, do_kmem_cache_create() calls __kmem_cache_release() which also performs free_kmem_cache_nodes(). 
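The reasoning above can be sketched as a userspace model. All names here (model_cache, model_init_nodes(), model_release(), model_create()) are hypothetical and only mimic the shape of the call chain; the point is that when the caller always runs a release step on failure, the init helper does not need its own cleanup.

```c
#include <stdlib.h>

#define MODEL_NODES 4

/* Hypothetical stand-in for a cache with one node structure per NUMA node. */
struct model_cache {
	void *node[MODEL_NODES];
	int fail_at;	/* index at which allocation is made to fail, or -1 */
};

/* Release path: frees whatever was set up, also for a partial init. */
static void model_release(struct model_cache *s)
{
	for (int i = 0; i < MODEL_NODES; i++) {
		free(s->node[i]);
		s->node[i] = NULL;
	}
}

/* Returns 1 on success and 0 on failure, without cleaning up after itself. */
static int model_init_nodes(struct model_cache *s)
{
	for (int i = 0; i < MODEL_NODES; i++) {
		void *n = (i == s->fail_at) ? NULL : malloc(16);

		if (!n)
			return 0;
		s->node[i] = n;
	}
	return 1;
}

/* The caller handles every failure with a single release call. */
static int model_create(struct model_cache *s)
{
	if (!model_init_nodes(s)) {
		model_release(s);
		return -1;
	}
	return 0;
}

/* Counts the nodes left allocated after model_create(), then releases them. */
static int model_nodes_after_create(int fail_at)
{
	struct model_cache s = { .fail_at = fail_at };
	int live = 0;

	model_create(&s);
	for (int i = 0; i < MODEL_NODES; i++)
		live += s.node[i] != NULL;
	model_release(&s);
	return live;
}
```

With a simulated failure at any index, no nodes remain allocated after model_create() returns, even though model_init_nodes() itself frees nothing.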
Signed-off-by: Vlastimil Babka Reviewed-by: Harry Yoo --- mm/slub.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index 30003763d224c2704a4b93082b8b47af12dcffc5..9f671ec76131c4b0b28d5d568aa= 45842b5efb6d4 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5669,10 +5669,8 @@ static int init_kmem_cache_nodes(struct kmem_cache *= s) n =3D kmem_cache_alloc_node(kmem_cache_node, GFP_KERNEL, node); =20 - if (!n) { - free_kmem_cache_nodes(s); + if (!n) return 0; - } =20 init_kmem_cache_node(n); s->node[node] =3D n; --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A194819C569 for ; Wed, 3 Sep 2025 13:00:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904424; cv=none; b=fqtmeG5hfsZMkfI+imcXWDXTWuJNYbpyWWxfZtyx/tLKIivLaMq4WlIqon9NhCjBn1yIESI37ps27FZTYJt7mkiYwgzQLVBHEqnGkeL1O3gk2CkMaMy1O02e/8CWn51UoDEjNt1t4e+HpnFP32rRmSmE0nfjV62R0gVj4gdCuDU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904424; c=relaxed/simple; bh=wZytbPB9OD3bWNcHRamCNNgUIDpIgKj66s7xNhnWz+8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Ptk8SVXyv2FqgENX89CH4XuF8p91Fdgx/eLH7tIrHwxijczYUXzSqtd5RilJMYErTAvVxtZjuCEBtE3DDfOf27+TJ73nqo1rlJonyuWbFKRi1LRMnizzh2nUVIPRBrFNDtiMMmZimGuKcxspF5AHHSSTn8IWE2HNYgKKvpRgkmk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=RnsaHV4N; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=NQfl5DGW; dkim=pass (1024-bit key) 
header.d=suse.cz header.i=@suse.cz header.b=fSpNr05h; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Z8XtkDu+; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="RnsaHV4N"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="NQfl5DGW"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="fSpNr05h"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Z8XtkDu+" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 67AC621224; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gBzJcITAPCpZAAMcYcOTc3QuaQ2hOZMQHTG70adF3rY=; b=RnsaHV4N6itJlTMutK9QREXbjk6uUjn9LLvSLbI8r5RE4GYR4xvO33sW2MkrdXNisv0OhE G5ZZCB0wDWBT8B/NwAndTDc4toYV+1TDz3X6ot1y8qbPU2sx4IQpG0FLAREunpYkC7YhiC wkN/OBaEpdYlQZrVgOoF7/mTTmMlrGg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=gBzJcITAPCpZAAMcYcOTc3QuaQ2hOZMQHTG70adF3rY=; b=NQfl5DGWjKZGzafxaeb3hy/Efbf62Jft4rmWu1/Fi4MaJ50FXwi5I8JoC8ynnnQivLePph r/X+ylHPjb8etCAg== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=fSpNr05h; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=Z8XtkDu+ DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gBzJcITAPCpZAAMcYcOTc3QuaQ2hOZMQHTG70adF3rY=; b=fSpNr05hXEONeV74qPz80DRIreHLIjZyVDzwM4vYN68j91o8bV7rsMCBjVXOaycx8GFMRD D3k1KzR4gH/+XCihxAXT/mStZN+pyghyCgwyud2xVBxnProApAiXkjcoRiN/cl5UnrPzU7 dVn4hwC4KGK6zMAatKH0jGV4+I2MCvQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gBzJcITAPCpZAAMcYcOTc3QuaQ2hOZMQHTG70adF3rY=; b=Z8XtkDu+NWwdW+hGPq/4BOhpLANdMWcqB/P5ot5tM0Rk/8NoelYz81xNtJEmtsgxCnDvUv qCw5EqCcF4S+InAA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 4AB3013ACF; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id qID3Edk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:45 +0200 Subject: [PATCH v7 03/21] slab: add opt-in caching layer of percpu sheaves 
Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-3-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, Venkat Rao Bagalkote X-Mailer: b4 0.14.2 X-Spam-Level: X-Spam-Flag: NO X-Rspamd-Queue-Id: 67AC621224 X-Rspamd-Action: no action X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FUZZY_RATELIMITED(0.00)[rspamd.com]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,linux.ibm.com]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; 
DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo,suse.cz:dkim,suse.cz:mid,suse.cz:email]
X-Spam-Score: -4.51

Specifying a non-zero value for a new struct kmem_cache_args field
sheaf_capacity will set up a caching layer of percpu arrays called
sheaves of the given capacity for the created cache.

Allocations from the cache will allocate via the percpu sheaves (main or
spare) as long as they have no NUMA node preference. Frees will also
put the object back into one of the sheaves.

When both percpu sheaves are found empty during an allocation, an empty
sheaf may be replaced with a full one from the per-node barn. If none
are available and the allocation is allowed to block, an empty sheaf is
refilled from slab(s) by an internal bulk alloc operation. When both
percpu sheaves are full during freeing, the barn can replace a full one
with an empty one, unless the barn is over its limit of full sheaves. In
that case a sheaf is flushed to slab(s) by an internal bulk free
operation. Flushing sheaves and barns is also wired to the existing cpu
flushing and cache shrinking operations.

The sheaves do not distinguish NUMA locality of the cached objects. If
an allocation is requested with kmem_cache_alloc_node() (or a mempolicy
with strict_numa mode enabled) with a specific node (not NUMA_NO_NODE),
the sheaves are bypassed.

The bulk operations exposed to slab users also try to utilize the
sheaves as long as the necessary (full or empty) sheaves are available
on the cpu or in the barn. Once depleted, they fall back to bulk
alloc/free to slabs directly to avoid double copying.

The sheaf_capacity value is exported in sysfs for observability.

Sysfs CONFIG_SLUB_STATS counters alloc_cpu_sheaf and free_cpu_sheaf
count objects allocated or freed using the sheaves (and thus not
counting towards the other alloc/free path counters).
Counters sheaf_refill and sheaf_flush count objects filled or flushed
from or to slab pages, and can be used to assess how effective the
caching is. The refill and flush operations will also count towards the
usual alloc_fastpath/slowpath, free_fastpath/slowpath and other counters
for the backing slabs. For barn operations, barn_get and barn_put count
how many full sheaves were taken from or put to the barn; the _fail
variants count how many such requests could not be satisfied, mainly
because the barn was either empty or full. While the barn also holds
empty sheaves to make some operations easier, these are not critical
enough to warrant their own counters. Finally, there are
sheaf_alloc/sheaf_free counters.

Access to the percpu sheaves is protected by local_trylock() when
potential callers include irq context, and local_lock() otherwise (such
as when we already know the gfp flags allow blocking). The trylock
failures should be rare and we can easily fall back. Each per-NUMA-node
barn has a spin_lock.

When slub_debug is enabled for a cache with sheaf_capacity also
specified, the latter is ignored so that allocations and frees reach the
slow path where debugging hooks are processed. Similarly, we ignore it
with CONFIG_SLUB_TINY which prefers low memory usage to performance.

[boot failure: https://lore.kernel.org/all/583eacf5-c971-451a-9f76-fed0e341= b815@linux.ibm.com/ ]
Reported-and-tested-by: Venkat Rao Bagalkote
Signed-off-by: Vlastimil Babka
--- include/linux/slab.h | 31 ++ mm/slab.h | 2 + mm/slab_common.c | 5 +- mm/slub.c | 1164 ++++++++++++++++++++++++++++++++++++++++++++++= +--- 4 files changed, 1143 insertions(+), 59 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index d5a8ab98035cf3e3d9043e3b038e1bebeff05b52..49acbcdc6696fd120c402adf757= b3f41660ad50a 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -335,6 +335,37 @@ struct kmem_cache_args { * %NULL means no constructor.
*/ void (*ctor)(void *); + /** + * @sheaf_capacity: Enable sheaves of given capacity for the cache. + * + * With a non-zero value, allocations from the cache go through caching + * arrays called sheaves. Each cpu has a main sheaf that's always + * present, and a spare sheaf that may be not present. When both become + * empty, there's an attempt to replace an empty sheaf with a full sheaf + * from the per-node barn. + * + * When no full sheaf is available, and gfp flags allow blocking, a + * sheaf is allocated and filled from slab(s) using bulk allocation. + * Otherwise the allocation falls back to the normal operation + * allocating a single object from a slab. + * + * Analogically when freeing and both percpu sheaves are full, the barn + * may replace it with an empty sheaf, unless it's over capacity. In + * that case a sheaf is bulk freed to slab pages. + * + * The sheaves do not enforce NUMA placement of objects, so allocations + * via kmem_cache_alloc_node() with a node specified other than + * NUMA_NO_NODE will bypass them. + * + * Bulk allocation and free operations also try to use the cpu sheaves + * and barn, but fallback to using slab pages directly. + * + * When slub_debug is enabled for the cache, the sheaf_capacity argument + * is ignored. + * + * %0 means no sheaves will be created. + */ + unsigned int sheaf_capacity; }; =20 struct kmem_cache *__kmem_cache_create_args(const char *name, diff --git a/mm/slab.h b/mm/slab.h index 248b34c839b7ca39cf14e139c62d116efb97d30f..206987ce44a4d053ebe3b5e5078= 4d2dd23822cd1 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -235,6 +235,7 @@ struct kmem_cache { #ifndef CONFIG_SLUB_TINY struct kmem_cache_cpu __percpu *cpu_slab; #endif + struct slub_percpu_sheaves __percpu *cpu_sheaves; /* Used for retrieving partial slabs, etc. 
*/ slab_flags_t flags; unsigned long min_partial; @@ -248,6 +249,7 @@ struct kmem_cache { /* Number of per cpu partial slabs to keep around */ unsigned int cpu_partial_slabs; #endif + unsigned int sheaf_capacity; struct kmem_cache_order_objects oo; =20 /* Allocation and freeing of slabs */ diff --git a/mm/slab_common.c b/mm/slab_common.c index bfe7c40eeee1a01c175766935c1e3c0304434a53..e2b197e47866c30acdbd1fee415= 9f262a751c5a7 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -163,6 +163,9 @@ int slab_unmergeable(struct kmem_cache *s) return 1; #endif =20 + if (s->cpu_sheaves) + return 1; + /* * We may have set a slab to be unmergeable during bootstrap. */ @@ -321,7 +324,7 @@ struct kmem_cache *__kmem_cache_create_args(const char = *name, object_size - args->usersize < args->useroffset)) args->usersize =3D args->useroffset =3D 0; =20 - if (!args->usersize) + if (!args->usersize && !args->sheaf_capacity) s =3D __kmem_cache_alias(name, object_size, args->align, flags, args->ctor); if (s) diff --git a/mm/slub.c b/mm/slub.c index 9f671ec76131c4b0b28d5d568aa45842b5efb6d4..42cb5848f1cecb17174967ff8b1= 02b20a50110e3 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -363,8 +363,10 @@ static inline void debugfs_slab_add(struct kmem_cache = *s) { } #endif =20 enum stat_item { + ALLOC_PCS, /* Allocation from percpu sheaf */ ALLOC_FASTPATH, /* Allocation from cpu slab */ ALLOC_SLOWPATH, /* Allocation by getting a new cpu slab */ + FREE_PCS, /* Free to percpu sheaf */ FREE_FASTPATH, /* Free to cpu slab */ FREE_SLOWPATH, /* Freeing not to cpu slab */ FREE_FROZEN, /* Freeing to frozen slab */ @@ -389,6 +391,14 @@ enum stat_item { CPU_PARTIAL_FREE, /* Refill cpu partial on free */ CPU_PARTIAL_NODE, /* Refill cpu partial from node partial */ CPU_PARTIAL_DRAIN, /* Drain cpu partial to node partial */ + SHEAF_FLUSH, /* Objects flushed from a sheaf */ + SHEAF_REFILL, /* Objects refilled to a sheaf */ + SHEAF_ALLOC, /* Allocation of an empty sheaf */ + SHEAF_FREE, /* Freeing of an empty 
sheaf */ + BARN_GET, /* Got full sheaf from barn */ + BARN_GET_FAIL, /* Failed to get full sheaf from barn */ + BARN_PUT, /* Put full sheaf to barn */ + BARN_PUT_FAIL, /* Failed to put full sheaf to barn */ NR_SLUB_STAT_ITEMS }; =20 @@ -435,6 +445,32 @@ void stat_add(const struct kmem_cache *s, enum stat_it= em si, int v) #endif } =20 +#define MAX_FULL_SHEAVES 10 +#define MAX_EMPTY_SHEAVES 10 + +struct node_barn { + spinlock_t lock; + struct list_head sheaves_full; + struct list_head sheaves_empty; + unsigned int nr_full; + unsigned int nr_empty; +}; + +struct slab_sheaf { + union { + struct rcu_head rcu_head; + struct list_head barn_list; + }; + unsigned int size; + void *objects[]; +}; + +struct slub_percpu_sheaves { + local_trylock_t lock; + struct slab_sheaf *main; /* never NULL when unlocked */ + struct slab_sheaf *spare; /* empty or full, may be NULL */ +}; + /* * The slab lists for all objects. */ @@ -447,6 +483,7 @@ struct kmem_cache_node { atomic_long_t total_objects; struct list_head full; #endif + struct node_barn *barn; }; =20 static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int n= ode) @@ -454,6 +491,12 @@ static inline struct kmem_cache_node *get_node(struct = kmem_cache *s, int node) return s->node[node]; } =20 +/* Get the barn of the current cpu's memory node */ +static inline struct node_barn *get_barn(struct kmem_cache *s) +{ + return get_node(s, numa_mem_id())->barn; +} + /* * Iterator over all nodes. The body will be executed for each node that h= as * a kmem_cache_node structure allocated (which is true for all online nod= es) @@ -470,12 +513,19 @@ static inline struct kmem_cache_node *get_node(struct= kmem_cache *s, int node) */ static nodemask_t slab_nodes; =20 -#ifndef CONFIG_SLUB_TINY /* * Workqueue used for flush_cpu_slab(). 
*/ static struct workqueue_struct *flushwq; -#endif + +struct slub_flush_work { + struct work_struct work; + struct kmem_cache *s; + bool skip; +}; + +static DEFINE_MUTEX(flush_lock); +static DEFINE_PER_CPU(struct slub_flush_work, slub_flush); =20 /******************************************************************** * Core slab cache functions @@ -2473,6 +2523,360 @@ static void *setup_object(struct kmem_cache *s, voi= d *object) return object; } =20 +static struct slab_sheaf *alloc_empty_sheaf(struct kmem_cache *s, gfp_t gf= p) +{ + struct slab_sheaf *sheaf =3D kzalloc(struct_size(sheaf, objects, + s->sheaf_capacity), gfp); + + if (unlikely(!sheaf)) + return NULL; + + stat(s, SHEAF_ALLOC); + + return sheaf; +} + +static void free_empty_sheaf(struct kmem_cache *s, struct slab_sheaf *shea= f) +{ + kfree(sheaf); + + stat(s, SHEAF_FREE); +} + +static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, + size_t size, void **p); + + +static int refill_sheaf(struct kmem_cache *s, struct slab_sheaf *sheaf, + gfp_t gfp) +{ + int to_fill =3D s->sheaf_capacity - sheaf->size; + int filled; + + if (!to_fill) + return 0; + + filled =3D __kmem_cache_alloc_bulk(s, gfp, to_fill, + &sheaf->objects[sheaf->size]); + + sheaf->size +=3D filled; + + stat_add(s, SHEAF_REFILL, filled); + + if (filled < to_fill) + return -ENOMEM; + + return 0; +} + + +static struct slab_sheaf *alloc_full_sheaf(struct kmem_cache *s, gfp_t gfp) +{ + struct slab_sheaf *sheaf =3D alloc_empty_sheaf(s, gfp); + + if (!sheaf) + return NULL; + + if (refill_sheaf(s, sheaf, gfp)) { + free_empty_sheaf(s, sheaf); + return NULL; + } + + return sheaf; +} + +/* + * Maximum number of objects freed during a single flush of main pcs sheaf. + * Translates directly to an on-stack array size. + */ +#define PCS_BATCH_MAX 32U + +static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void= **p); + +/* + * Free all objects from the main sheaf. 
In order to perform + * __kmem_cache_free_bulk() outside of cpu_sheaves->lock, work in batches = where + * object pointers are moved to an on-stack array under the lock. To bound= the + * stack usage, limit each batch to PCS_BATCH_MAX. + * + * returns true if at least partially flushed + */ +static bool sheaf_flush_main(struct kmem_cache *s) +{ + struct slub_percpu_sheaves *pcs; + unsigned int batch, remaining; + void *objects[PCS_BATCH_MAX]; + struct slab_sheaf *sheaf; + bool ret =3D false; + +next_batch: + if (!local_trylock(&s->cpu_sheaves->lock)) + return ret; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + sheaf =3D pcs->main; + + batch =3D min(PCS_BATCH_MAX, sheaf->size); + + sheaf->size -=3D batch; + memcpy(objects, sheaf->objects + sheaf->size, batch * sizeof(void *)); + + remaining =3D sheaf->size; + + local_unlock(&s->cpu_sheaves->lock); + + __kmem_cache_free_bulk(s, batch, &objects[0]); + + stat_add(s, SHEAF_FLUSH, batch); + + ret =3D true; + + if (remaining) + goto next_batch; + + return ret; +} + +/* + * Free all objects from a sheaf that's unused, i.e. not linked to any + * cpu_sheaves, so we need no locking and batching. The locking is also not + * necessary when flushing cpu's sheaves (both spare and main) during cpu + * hotremove as the cpu is not executing anymore. 
+ */ +static void sheaf_flush_unused(struct kmem_cache *s, struct slab_sheaf *sh= eaf) +{ + if (!sheaf->size) + return; + + stat_add(s, SHEAF_FLUSH, sheaf->size); + + __kmem_cache_free_bulk(s, sheaf->size, &sheaf->objects[0]); + + sheaf->size =3D 0; +} + +/* + * Caller needs to make sure migration is disabled in order to fully flush + * single cpu's sheaves + * + * must not be called from an irq + * + * flushing operations are rare so let's keep it simple and flush to slabs + * directly, skipping the barn + */ +static void pcs_flush_all(struct kmem_cache *s) +{ + struct slub_percpu_sheaves *pcs; + struct slab_sheaf *spare; + + local_lock(&s->cpu_sheaves->lock); + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + spare =3D pcs->spare; + pcs->spare =3D NULL; + + local_unlock(&s->cpu_sheaves->lock); + + if (spare) { + sheaf_flush_unused(s, spare); + free_empty_sheaf(s, spare); + } + + sheaf_flush_main(s); +} + +static void __pcs_flush_all_cpu(struct kmem_cache *s, unsigned int cpu) +{ + struct slub_percpu_sheaves *pcs; + + pcs =3D per_cpu_ptr(s->cpu_sheaves, cpu); + + /* The cpu is not executing anymore so we don't need pcs->lock */ + sheaf_flush_unused(s, pcs->main); + if (pcs->spare) { + sheaf_flush_unused(s, pcs->spare); + free_empty_sheaf(s, pcs->spare); + pcs->spare =3D NULL; + } +} + +static void pcs_destroy(struct kmem_cache *s) +{ + int cpu; + + for_each_possible_cpu(cpu) { + struct slub_percpu_sheaves *pcs; + + pcs =3D per_cpu_ptr(s->cpu_sheaves, cpu); + + /* can happen when unwinding failed create */ + if (!pcs->main) + continue; + + /* + * We have already passed __kmem_cache_shutdown() so everything + * was flushed and there should be no objects allocated from + * slabs, otherwise kmem_cache_destroy() would have aborted. + * Therefore something would have to be really wrong if the + * warnings here trigger, and we should rather leave objects and + * sheaves to leak in that case. 
+ */ + + WARN_ON(pcs->spare); + + if (!WARN_ON(pcs->main->size)) { + free_empty_sheaf(s, pcs->main); + pcs->main =3D NULL; + } + } + + free_percpu(s->cpu_sheaves); + s->cpu_sheaves =3D NULL; +} + +static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn) +{ + struct slab_sheaf *empty =3D NULL; + unsigned long flags; + + spin_lock_irqsave(&barn->lock, flags); + + if (barn->nr_empty) { + empty =3D list_first_entry(&barn->sheaves_empty, + struct slab_sheaf, barn_list); + list_del(&empty->barn_list); + barn->nr_empty--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return empty; +} + +/* + * The following two functions are used mainly in cases where we have to u= ndo an + * intended action due to a race or cpu migration. Thus they do not check = the + * empty or full sheaf limits for simplicity. + */ + +static void barn_put_empty_sheaf(struct node_barn *barn, struct slab_sheaf= *sheaf) +{ + unsigned long flags; + + spin_lock_irqsave(&barn->lock, flags); + + list_add(&sheaf->barn_list, &barn->sheaves_empty); + barn->nr_empty++; + + spin_unlock_irqrestore(&barn->lock, flags); +} + +static void barn_put_full_sheaf(struct node_barn *barn, struct slab_sheaf = *sheaf) +{ + unsigned long flags; + + spin_lock_irqsave(&barn->lock, flags); + + list_add(&sheaf->barn_list, &barn->sheaves_full); + barn->nr_full++; + + spin_unlock_irqrestore(&barn->lock, flags); +} + +/* + * If a full sheaf is available, return it and put the supplied empty one = to + * barn. We ignore the limit on empty sheaves as the number of sheaves doe= sn't + * change. 
+ */ +static struct slab_sheaf * +barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + spin_lock_irqsave(&barn->lock, flags); + + if (barn->nr_full) { + full =3D list_first_entry(&barn->sheaves_full, struct slab_sheaf, + barn_list); + list_del(&full->barn_list); + list_add(&empty->barn_list, &barn->sheaves_empty); + barn->nr_full--; + barn->nr_empty++; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + +/* + * If an empty sheaf is available, return it and put the supplied full one= to + * barn. But if there are too many full sheaves, reject this with -E2BIG. + */ +static struct slab_sheaf * +barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full) +{ + struct slab_sheaf *empty; + unsigned long flags; + + spin_lock_irqsave(&barn->lock, flags); + + if (barn->nr_full >=3D MAX_FULL_SHEAVES) { + empty =3D ERR_PTR(-E2BIG); + } else if (!barn->nr_empty) { + empty =3D ERR_PTR(-ENOMEM); + } else { + empty =3D list_first_entry(&barn->sheaves_empty, struct slab_sheaf, + barn_list); + list_del(&empty->barn_list); + list_add(&full->barn_list, &barn->sheaves_full); + barn->nr_empty--; + barn->nr_full++; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return empty; +} + +static void barn_init(struct node_barn *barn) +{ + spin_lock_init(&barn->lock); + INIT_LIST_HEAD(&barn->sheaves_full); + INIT_LIST_HEAD(&barn->sheaves_empty); + barn->nr_full =3D 0; + barn->nr_empty =3D 0; +} + +static void barn_shrink(struct kmem_cache *s, struct node_barn *barn) +{ + struct list_head empty_list; + struct list_head full_list; + struct slab_sheaf *sheaf, *sheaf2; + unsigned long flags; + + INIT_LIST_HEAD(&empty_list); + INIT_LIST_HEAD(&full_list); + + spin_lock_irqsave(&barn->lock, flags); + + list_splice_init(&barn->sheaves_full, &full_list); + barn->nr_full =3D 0; + list_splice_init(&barn->sheaves_empty, &empty_list); + barn->nr_empty =3D 0; + + 
spin_unlock_irqrestore(&barn->lock, flags); + + list_for_each_entry_safe(sheaf, sheaf2, &full_list, barn_list) { + sheaf_flush_unused(s, sheaf); + free_empty_sheaf(s, sheaf); + } + + list_for_each_entry_safe(sheaf, sheaf2, &empty_list, barn_list) + free_empty_sheaf(s, sheaf); +} + /* * Slab allocation and freeing */ @@ -3344,11 +3748,42 @@ static inline void __flush_cpu_slab(struct kmem_cac= he *s, int cpu) put_partials_cpu(s, c); } =20 -struct slub_flush_work { - struct work_struct work; - struct kmem_cache *s; - bool skip; -}; +static inline void flush_this_cpu_slab(struct kmem_cache *s) +{ + struct kmem_cache_cpu *c =3D this_cpu_ptr(s->cpu_slab); + + if (c->slab) + flush_slab(s, c); + + put_partials(s); +} + +static bool has_cpu_slab(int cpu, struct kmem_cache *s) +{ + struct kmem_cache_cpu *c =3D per_cpu_ptr(s->cpu_slab, cpu); + + return c->slab || slub_percpu_partial(c); +} + +#else /* CONFIG_SLUB_TINY */ +static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu) { } +static inline bool has_cpu_slab(int cpu, struct kmem_cache *s) { return fa= lse; } +static inline void flush_this_cpu_slab(struct kmem_cache *s) { } +#endif /* CONFIG_SLUB_TINY */ + +static bool has_pcs_used(int cpu, struct kmem_cache *s) +{ + struct slub_percpu_sheaves *pcs; + + if (!s->cpu_sheaves) + return false; + + pcs =3D per_cpu_ptr(s->cpu_sheaves, cpu); + + return (pcs->spare || pcs->main->size); +} + +static void pcs_flush_all(struct kmem_cache *s); =20 /* * Flush cpu slab. 
@@ -3358,30 +3793,18 @@ struct slub_flush_work { static void flush_cpu_slab(struct work_struct *w) { struct kmem_cache *s; - struct kmem_cache_cpu *c; struct slub_flush_work *sfw; =20 sfw =3D container_of(w, struct slub_flush_work, work); =20 s =3D sfw->s; - c =3D this_cpu_ptr(s->cpu_slab); - - if (c->slab) - flush_slab(s, c); - - put_partials(s); -} =20 -static bool has_cpu_slab(int cpu, struct kmem_cache *s) -{ - struct kmem_cache_cpu *c =3D per_cpu_ptr(s->cpu_slab, cpu); + if (s->cpu_sheaves) + pcs_flush_all(s); =20 - return c->slab || slub_percpu_partial(c); + flush_this_cpu_slab(s); } =20 -static DEFINE_MUTEX(flush_lock); -static DEFINE_PER_CPU(struct slub_flush_work, slub_flush); - static void flush_all_cpus_locked(struct kmem_cache *s) { struct slub_flush_work *sfw; @@ -3392,7 +3815,7 @@ static void flush_all_cpus_locked(struct kmem_cache *= s) =20 for_each_online_cpu(cpu) { sfw =3D &per_cpu(slub_flush, cpu); - if (!has_cpu_slab(cpu, s)) { + if (!has_cpu_slab(cpu, s) && !has_pcs_used(cpu, s)) { sfw->skip =3D true; continue; } @@ -3428,19 +3851,15 @@ static int slub_cpu_dead(unsigned int cpu) struct kmem_cache *s; =20 mutex_lock(&slab_mutex); - list_for_each_entry(s, &slab_caches, list) + list_for_each_entry(s, &slab_caches, list) { __flush_cpu_slab(s, cpu); + if (s->cpu_sheaves) + __pcs_flush_all_cpu(s, cpu); + } mutex_unlock(&slab_mutex); return 0; } =20 -#else /* CONFIG_SLUB_TINY */ -static inline void flush_all_cpus_locked(struct kmem_cache *s) { } -static inline void flush_all(struct kmem_cache *s) { } -static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu) { } -static inline int slub_cpu_dead(unsigned int cpu) { return 0; } -#endif /* CONFIG_SLUB_TINY */ - /* * Check if the objects in a per cpu structure fit numa * locality expectations. 
@@ -4191,30 +4610,240 @@ bool slab_post_alloc_hook(struct kmem_cache *s, st= ruct list_lru *lru, } =20 /* - * Inlined fastpath so that allocation functions (kmalloc, kmem_cache_allo= c) - * have the fastpath folded into their functions. So no function call - * overhead for requests that can be satisfied on the fastpath. - * - * The fastpath works by first checking if the lockless freelist can be us= ed. - * If not then __slab_alloc is called for slow processing. + * Replace the empty main sheaf with a (at least partially) full sheaf. * - * Otherwise we can simply pick the next object from the lockless free lis= t. + * Must be called with the cpu_sheaves local lock locked. If successful, r= eturns + * the pcs pointer and the local lock locked (possibly on a different cpu = than + * initially called). If not successful, returns NULL and the local lock + * unlocked. */ -static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struc= t list_lru *lru, - gfp_t gfpflags, int node, unsigned long addr, size_t orig_size) +static struct slub_percpu_sheaves * +__pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves = *pcs, gfp_t gfp) { - void *object; - bool init =3D false; + struct slab_sheaf *empty =3D NULL; + struct slab_sheaf *full; + struct node_barn *barn; + bool can_alloc; =20 - s =3D slab_pre_alloc_hook(s, gfpflags); - if (unlikely(!s)) + lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); + + if (pcs->spare && pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + barn =3D get_barn(s); + + full =3D barn_replace_empty_sheaf(barn, pcs->main); + + if (full) { + stat(s, BARN_GET); + pcs->main =3D full; + return pcs; + } + + stat(s, BARN_GET_FAIL); + + can_alloc =3D gfpflags_allow_blocking(gfp); + + if (can_alloc) { + if (pcs->spare) { + empty =3D pcs->spare; + pcs->spare =3D NULL; + } else { + empty =3D barn_get_empty_sheaf(barn); + } + } + + local_unlock(&s->cpu_sheaves->lock); + + if (!can_alloc) + return NULL; + 
+ if (empty) { + if (!refill_sheaf(s, empty, gfp)) { + full =3D empty; + } else { + /* + * we must be very low on memory so don't bother + * with the barn + */ + free_empty_sheaf(s, empty); + } + } else { + full =3D alloc_full_sheaf(s, gfp); + } + + if (!full) + return NULL; + + /* + * we can reach here only when gfpflags_allow_blocking + * so this must not be an irq + */ + local_lock(&s->cpu_sheaves->lock); + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + /* + * If we are returning empty sheaf, we either got it from the + * barn or had to allocate one. If we are returning a full + * sheaf, it's due to racing or being migrated to a different + * cpu. Breaching the barn's sheaf limits should be thus rare + * enough so just ignore them to simplify the recovery. + */ + + if (pcs->main->size =3D=3D 0) { + barn_put_empty_sheaf(barn, pcs->main); + pcs->main =3D full; + return pcs; + } + + if (!pcs->spare) { + pcs->spare =3D full; + return pcs; + } + + if (pcs->spare->size =3D=3D 0) { + barn_put_empty_sheaf(barn, pcs->spare); + pcs->spare =3D full; + return pcs; + } + + barn_put_full_sheaf(barn, full); + stat(s, BARN_PUT); + + return pcs; +} + +static __fastpath_inline +void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp) +{ + struct slub_percpu_sheaves *pcs; + void *object; + +#ifdef CONFIG_NUMA + if (static_branch_unlikely(&strict_numa)) { + if (current->mempolicy) + return NULL; + } +#endif + + if (!local_trylock(&s->cpu_sheaves->lock)) + return NULL; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (unlikely(pcs->main->size =3D=3D 0)) { + pcs =3D __pcs_replace_empty_main(s, pcs, gfp); + if (unlikely(!pcs)) + return NULL; + } + + object =3D pcs->main->objects[--pcs->main->size]; + + local_unlock(&s->cpu_sheaves->lock); + + stat(s, ALLOC_PCS); + + return object; +} + +static __fastpath_inline +unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void *= *p) +{ + struct slub_percpu_sheaves *pcs; + struct slab_sheaf *main; + unsigned int allocated =3D 0; + 
unsigned int batch; + +next_batch: + if (!local_trylock(&s->cpu_sheaves->lock)) + return allocated; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (unlikely(pcs->main->size =3D=3D 0)) { + + struct slab_sheaf *full; + + if (pcs->spare && pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + goto do_alloc; + } + + full =3D barn_replace_empty_sheaf(get_barn(s), pcs->main); + + if (full) { + stat(s, BARN_GET); + pcs->main =3D full; + goto do_alloc; + } + + stat(s, BARN_GET_FAIL); + + local_unlock(&s->cpu_sheaves->lock); + + /* + * Once full sheaves in barn are depleted, let the bulk + * allocation continue from slab pages, otherwise we would just + * be copying arrays of pointers twice. + */ + return allocated; + } + +do_alloc: + + main =3D pcs->main; + batch =3D min(size, main->size); + + main->size -=3D batch; + memcpy(p, main->objects + main->size, batch * sizeof(void *)); + + local_unlock(&s->cpu_sheaves->lock); + + stat_add(s, ALLOC_PCS, batch); + + allocated +=3D batch; + + if (batch < size) { + p +=3D batch; + size -=3D batch; + goto next_batch; + } + + return allocated; +} + + +/* + * Inlined fastpath so that allocation functions (kmalloc, kmem_cache_allo= c) + * have the fastpath folded into their functions. So no function call + * overhead for requests that can be satisfied on the fastpath. + * + * The fastpath works by first checking if the lockless freelist can be us= ed. + * If not then __slab_alloc is called for slow processing. + * + * Otherwise we can simply pick the next object from the lockless free lis= t. 
+ */ +static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struc= t list_lru *lru, + gfp_t gfpflags, int node, unsigned long addr, size_t orig_size) +{ + void *object; + bool init =3D false; + + s =3D slab_pre_alloc_hook(s, gfpflags); + if (unlikely(!s)) return NULL; =20 object =3D kfence_alloc(s, orig_size, gfpflags); if (unlikely(object)) goto out; =20 - object =3D __slab_alloc_node(s, gfpflags, node, addr, orig_size); + if (s->cpu_sheaves && node =3D=3D NUMA_NO_NODE) + object =3D alloc_from_pcs(s, gfpflags); + + if (!object) + object =3D __slab_alloc_node(s, gfpflags, node, addr, orig_size); =20 maybe_wipe_obj_freeptr(s, object); init =3D slab_want_init_on_alloc(gfpflags, s); @@ -4591,6 +5220,295 @@ static void __slab_free(struct kmem_cache *s, struc= t slab *slab, discard_slab(s, slab); } =20 +/* + * pcs is locked. We should have got rid of the spare sheaf and obtained an + * empty sheaf, while the main sheaf is full. We want to install the empty= sheaf + * as a main sheaf, and make the current main sheaf a spare sheaf. + * + * However due to having relinquished the cpu_sheaves lock when obtaining + * the empty sheaf, we need to handle some unlikely but possible cases. + * + * If we put any sheaf to barn here, it's because we were interrupted or h= ave + * been migrated to a different cpu, which should be rare enough so just i= gnore + * the barn's limits to simplify the handling. + * + * An alternative scenario that gets us here is when we fail + * barn_replace_full_sheaf(), because there's no empty sheaf available in = the + * barn, so we had to allocate it by alloc_empty_sheaf(). But because we s= aw the + * limit on full sheaves was not exceeded, we assume it didn't change and = just + * put the full sheaf there. 
+ */ +static void __pcs_install_empty_sheaf(struct kmem_cache *s, + struct slub_percpu_sheaves *pcs, struct slab_sheaf *empty) +{ + struct node_barn *barn; + + lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); + + /* This is what we expect to find if nobody interrupted us. */ + if (likely(!pcs->spare)) { + pcs->spare =3D pcs->main; + pcs->main =3D empty; + return; + } + + barn =3D get_barn(s); + + /* + * Unlikely because if the main sheaf had space, we would have just + * freed to it. Get rid of our empty sheaf. + */ + if (pcs->main->size < s->sheaf_capacity) { + barn_put_empty_sheaf(barn, empty); + return; + } + + /* Also unlikely for the same reason */ + if (pcs->spare->size < s->sheaf_capacity) { + swap(pcs->main, pcs->spare); + barn_put_empty_sheaf(barn, empty); + return; + } + + /* + * We probably failed barn_replace_full_sheaf() due to no empty sheaf + * available there, but we allocated one, so finish the job. + */ + barn_put_full_sheaf(barn, pcs->main); + stat(s, BARN_PUT); + pcs->main =3D empty; +} + +/* + * Replace the full main sheaf with a (at least partially) empty sheaf. + * + * Must be called with the cpu_sheaves local lock locked. If successful, r= eturns + * the pcs pointer and the local lock locked (possibly on a different cpu = than + * initially called). If not successful, returns NULL and the local lock + * unlocked. 
+ */ +static struct slub_percpu_sheaves * +__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *= pcs) +{ + struct slab_sheaf *empty; + struct node_barn *barn; + bool put_fail; + +restart: + lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); + + barn =3D get_barn(s); + put_fail =3D false; + + if (!pcs->spare) { + empty =3D barn_get_empty_sheaf(barn); + if (empty) { + pcs->spare =3D pcs->main; + pcs->main =3D empty; + return pcs; + } + goto alloc_empty; + } + + if (pcs->spare->size < s->sheaf_capacity) { + swap(pcs->main, pcs->spare); + return pcs; + } + + empty =3D barn_replace_full_sheaf(barn, pcs->main); + + if (!IS_ERR(empty)) { + stat(s, BARN_PUT); + pcs->main =3D empty; + return pcs; + } + + if (PTR_ERR(empty) =3D=3D -E2BIG) { + /* Since we got here, spare exists and is full */ + struct slab_sheaf *to_flush =3D pcs->spare; + + stat(s, BARN_PUT_FAIL); + + pcs->spare =3D NULL; + local_unlock(&s->cpu_sheaves->lock); + + sheaf_flush_unused(s, to_flush); + empty =3D to_flush; + goto got_empty; + } + + /* + * We could not replace full sheaf because barn had no empty + * sheaves. We can still allocate it and put the full sheaf in + * __pcs_install_empty_sheaf(), but if we fail to allocate it, + * make sure to count the fail. 
+ */ + put_fail =3D true; + +alloc_empty: + local_unlock(&s->cpu_sheaves->lock); + + empty =3D alloc_empty_sheaf(s, GFP_NOWAIT); + if (empty) + goto got_empty; + + if (put_fail) + stat(s, BARN_PUT_FAIL); + + if (!sheaf_flush_main(s)) + return NULL; + + if (!local_trylock(&s->cpu_sheaves->lock)) + return NULL; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + /* + * we flushed the main sheaf so it should be empty now, + * but in case we got preempted or migrated, we need to + * check again + */ + if (pcs->main->size =3D=3D s->sheaf_capacity) + goto restart; + + return pcs; + +got_empty: + if (!local_trylock(&s->cpu_sheaves->lock)) { + barn_put_empty_sheaf(barn, empty); + return NULL; + } + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + __pcs_install_empty_sheaf(s, pcs, empty); + + return pcs; +} + +/* + * Free an object to the percpu sheaves. + * The object is expected to have passed slab_free_hook() already. + */ +static __fastpath_inline +bool free_to_pcs(struct kmem_cache *s, void *object) +{ + struct slub_percpu_sheaves *pcs; + + if (!local_trylock(&s->cpu_sheaves->lock)) + return false; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (unlikely(pcs->main->size =3D=3D s->sheaf_capacity)) { + + pcs =3D __pcs_replace_full_main(s, pcs); + if (unlikely(!pcs)) + return false; + } + + pcs->main->objects[pcs->main->size++] =3D object; + + local_unlock(&s->cpu_sheaves->lock); + + stat(s, FREE_PCS); + + return true; +} + +/* + * Bulk free objects to the percpu sheaves. + * Unlike free_to_pcs() this includes the calls to all necessary hooks + * and the fallback to freeing to slab pages. 
+ */ +static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p) +{ + struct slub_percpu_sheaves *pcs; + struct slab_sheaf *main, *empty; + bool init =3D slab_want_init_on_free(s); + unsigned int batch, i =3D 0; + struct node_barn *barn; + + while (i < size) { + struct slab *slab =3D virt_to_slab(p[i]); + + memcg_slab_free_hook(s, slab, p + i, 1); + alloc_tagging_slab_free_hook(s, slab, p + i, 1); + + if (unlikely(!slab_free_hook(s, p[i], init, false))) { + p[i] =3D p[--size]; + if (!size) + return; + continue; + } + + i++; + } + +next_batch: + if (!local_trylock(&s->cpu_sheaves->lock)) + goto fallback; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (likely(pcs->main->size < s->sheaf_capacity)) + goto do_free; + + barn =3D get_barn(s); + + if (!pcs->spare) { + empty =3D barn_get_empty_sheaf(barn); + if (!empty) + goto no_empty; + + pcs->spare =3D pcs->main; + pcs->main =3D empty; + goto do_free; + } + + if (pcs->spare->size < s->sheaf_capacity) { + swap(pcs->main, pcs->spare); + goto do_free; + } + + empty =3D barn_replace_full_sheaf(barn, pcs->main); + if (IS_ERR(empty)) { + stat(s, BARN_PUT_FAIL); + goto no_empty; + } + + stat(s, BARN_PUT); + pcs->main =3D empty; + +do_free: + main =3D pcs->main; + batch =3D min(size, s->sheaf_capacity - main->size); + + memcpy(main->objects + main->size, p, batch * sizeof(void *)); + main->size +=3D batch; + + local_unlock(&s->cpu_sheaves->lock); + + stat_add(s, FREE_PCS, batch); + + if (batch < size) { + p +=3D batch; + size -=3D batch; + goto next_batch; + } + + return; + +no_empty: + local_unlock(&s->cpu_sheaves->lock); + + /* + * if we depleted all empty sheaves in the barn or there are too + * many full sheaves, free the rest to slab pages + */ +fallback: + __kmem_cache_free_bulk(s, size, p); +} + #ifndef CONFIG_SLUB_TINY /* * Fastpath with forced inlining to produce a kfree and kmem_cache_free th= at @@ -4677,7 +5595,10 @@ void slab_free(struct kmem_cache *s, struct slab *sl= ab, void *object, 
memcg_slab_free_hook(s, slab, &object, 1); alloc_tagging_slab_free_hook(s, slab, &object, 1); =20 - if (likely(slab_free_hook(s, object, slab_want_init_on_free(s), false))) + if (unlikely(!slab_free_hook(s, object, slab_want_init_on_free(s), false)= )) + return; + + if (!s->cpu_sheaves || !free_to_pcs(s, object)) do_slab_free(s, slab, object, object, 1, addr); } =20 @@ -5273,6 +6194,15 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size= _t size, void **p) if (!size) return; =20 + /* + * freeing to sheaves is so incompatible with the detached freelist so + * once we go that way, we have to do everything differently + */ + if (s && s->cpu_sheaves) { + free_to_pcs_bulk(s, size, p); + return; + } + do { struct detached_freelist df; =20 @@ -5391,7 +6321,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache = *s, gfp_t flags, int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t= size, void **p) { - int i; + unsigned int i =3D 0; =20 if (!size) return 0; @@ -5400,9 +6330,20 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *= s, gfp_t flags, size_t size, if (unlikely(!s)) return 0; =20 - i =3D __kmem_cache_alloc_bulk(s, flags, size, p); - if (unlikely(i =3D=3D 0)) - return 0; + if (s->cpu_sheaves) + i =3D alloc_from_pcs_bulk(s, size, p); + + if (i < size) { + /* + * If we ran out of memory, don't bother with freeing back to + * the percpu sheaves, we have bigger problems. + */ + if (unlikely(__kmem_cache_alloc_bulk(s, flags, size - i, p + i) =3D=3D 0= )) { + if (i > 0) + __kmem_cache_free_bulk(s, i, p); + return 0; + } + } =20 /* * memcg and kmem_cache debug support and memory initialization. @@ -5412,11 +6353,11 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache = *s, gfp_t flags, size_t size, slab_want_init_on_alloc(flags, s), s->object_size))) { return 0; } - return i; + + return size; } EXPORT_SYMBOL(kmem_cache_alloc_bulk_noprof); =20 - /* * Object placement in a slab is made very easy because we always start at * offset 0. 
If we tune the size of the object to the alignment then we can @@ -5550,7 +6491,7 @@ static inline int calculate_order(unsigned int size) } =20 static void -init_kmem_cache_node(struct kmem_cache_node *n) +init_kmem_cache_node(struct kmem_cache_node *n, struct node_barn *barn) { n->nr_partial =3D 0; spin_lock_init(&n->list_lock); @@ -5560,6 +6501,9 @@ init_kmem_cache_node(struct kmem_cache_node *n) atomic_long_set(&n->total_objects, 0); INIT_LIST_HEAD(&n->full); #endif + n->barn =3D barn; + if (barn) + barn_init(barn); } =20 #ifndef CONFIG_SLUB_TINY @@ -5590,6 +6534,26 @@ static inline int alloc_kmem_cache_cpus(struct kmem_= cache *s) } #endif /* CONFIG_SLUB_TINY */ =20 +static int init_percpu_sheaves(struct kmem_cache *s) +{ + int cpu; + + for_each_possible_cpu(cpu) { + struct slub_percpu_sheaves *pcs; + + pcs =3D per_cpu_ptr(s->cpu_sheaves, cpu); + + local_trylock_init(&pcs->lock); + + pcs->main =3D alloc_empty_sheaf(s, GFP_KERNEL); + + if (!pcs->main) + return -ENOMEM; + } + + return 0; +} + static struct kmem_cache *kmem_cache_node; =20 /* @@ -5625,7 +6589,7 @@ static void early_kmem_cache_node_alloc(int node) slab->freelist =3D get_freepointer(kmem_cache_node, n); slab->inuse =3D 1; kmem_cache_node->node[node] =3D n; - init_kmem_cache_node(n); + init_kmem_cache_node(n, NULL); inc_slabs_node(kmem_cache_node, node, slab->objects); =20 /* @@ -5641,6 +6605,13 @@ static void free_kmem_cache_nodes(struct kmem_cache = *s) struct kmem_cache_node *n; =20 for_each_kmem_cache_node(s, node, n) { + if (n->barn) { + WARN_ON(n->barn->nr_full); + WARN_ON(n->barn->nr_empty); + kfree(n->barn); + n->barn =3D NULL; + } + s->node[node] =3D NULL; kmem_cache_free(kmem_cache_node, n); } @@ -5649,6 +6620,8 @@ static void free_kmem_cache_nodes(struct kmem_cache *= s) void __kmem_cache_release(struct kmem_cache *s) { cache_random_seq_destroy(s); + if (s->cpu_sheaves) + pcs_destroy(s); #ifndef CONFIG_SLUB_TINY free_percpu(s->cpu_slab); #endif @@ -5661,18 +6634,29 @@ static int 
init_kmem_cache_nodes(struct kmem_cache = *s) =20 for_each_node_mask(node, slab_nodes) { struct kmem_cache_node *n; + struct node_barn *barn =3D NULL; =20 if (slab_state =3D=3D DOWN) { early_kmem_cache_node_alloc(node); continue; } + + if (s->cpu_sheaves) { + barn =3D kmalloc_node(sizeof(*barn), GFP_KERNEL, node); + + if (!barn) + return 0; + } + n =3D kmem_cache_alloc_node(kmem_cache_node, GFP_KERNEL, node); - - if (!n) + if (!n) { + kfree(barn); return 0; + } + + init_kmem_cache_node(n, barn); =20 - init_kmem_cache_node(n); s->node[node] =3D n; } return 1; @@ -5929,6 +6913,8 @@ int __kmem_cache_shutdown(struct kmem_cache *s) flush_all_cpus_locked(s); /* Attempt to free all objects */ for_each_kmem_cache_node(s, node, n) { + if (n->barn) + barn_shrink(s, n->barn); free_partial(s, n); if (n->nr_partial || node_nr_slabs(n)) return 1; @@ -6132,6 +7118,9 @@ static int __kmem_cache_do_shrink(struct kmem_cache *= s) for (i =3D 0; i < SHRINK_PROMOTE_MAX; i++) INIT_LIST_HEAD(promote + i); =20 + if (n->barn) + barn_shrink(s, n->barn); + spin_lock_irqsave(&n->list_lock, flags); =20 /* @@ -6211,12 +7200,24 @@ static int slab_mem_going_online_callback(int nid) */ mutex_lock(&slab_mutex); list_for_each_entry(s, &slab_caches, list) { + struct node_barn *barn =3D NULL; + /* * The structure may already exist if the node was previously * onlined and offlined. 
*/ if (get_node(s, nid)) continue; + + if (s->cpu_sheaves) { + barn =3D kmalloc_node(sizeof(*barn), GFP_KERNEL, nid); + + if (!barn) { + ret =3D -ENOMEM; + goto out; + } + } + /* * XXX: kmem_cache_alloc_node will fallback to other nodes * since memory is not yet available from the node that @@ -6224,10 +7225,13 @@ static int slab_mem_going_online_callback(int nid) */ n =3D kmem_cache_alloc(kmem_cache_node, GFP_KERNEL); if (!n) { + kfree(barn); ret =3D -ENOMEM; goto out; } - init_kmem_cache_node(n); + + init_kmem_cache_node(n, barn); + s->node[nid] =3D n; } /* @@ -6440,6 +7444,17 @@ int do_kmem_cache_create(struct kmem_cache *s, const= char *name, =20 set_cpu_partial(s); =20 + if (args->sheaf_capacity && !IS_ENABLED(CONFIG_SLUB_TINY) + && !(s->flags & SLAB_DEBUG_FLAGS)) { + s->cpu_sheaves =3D alloc_percpu(struct slub_percpu_sheaves); + if (!s->cpu_sheaves) { + err =3D -ENOMEM; + goto out; + } + // TODO: increase capacity to grow slab_sheaf up to next kmalloc size? + s->sheaf_capacity =3D args->sheaf_capacity; + } + #ifdef CONFIG_NUMA s->remote_node_defrag_ratio =3D 1000; #endif @@ -6456,6 +7471,12 @@ int do_kmem_cache_create(struct kmem_cache *s, const= char *name, if (!alloc_kmem_cache_cpus(s)) goto out; =20 + if (s->cpu_sheaves) { + err =3D init_percpu_sheaves(s); + if (err) + goto out; + } + err =3D 0; =20 /* Mutex is not taken during early boot */ @@ -6908,6 +7929,12 @@ static ssize_t order_show(struct kmem_cache *s, char= *buf) } SLAB_ATTR_RO(order); =20 +static ssize_t sheaf_capacity_show(struct kmem_cache *s, char *buf) +{ + return sysfs_emit(buf, "%u\n", s->sheaf_capacity); +} +SLAB_ATTR_RO(sheaf_capacity); + static ssize_t min_partial_show(struct kmem_cache *s, char *buf) { return sysfs_emit(buf, "%lu\n", s->min_partial); @@ -7255,8 +8282,10 @@ static ssize_t text##_store(struct kmem_cache *s, \ } \ SLAB_ATTR(text); \ =20 +STAT_ATTR(ALLOC_PCS, alloc_cpu_sheaf); STAT_ATTR(ALLOC_FASTPATH, alloc_fastpath); STAT_ATTR(ALLOC_SLOWPATH, alloc_slowpath); 
+STAT_ATTR(FREE_PCS, free_cpu_sheaf); STAT_ATTR(FREE_FASTPATH, free_fastpath); STAT_ATTR(FREE_SLOWPATH, free_slowpath); STAT_ATTR(FREE_FROZEN, free_frozen); @@ -7281,6 +8310,14 @@ STAT_ATTR(CPU_PARTIAL_ALLOC, cpu_partial_alloc); STAT_ATTR(CPU_PARTIAL_FREE, cpu_partial_free); STAT_ATTR(CPU_PARTIAL_NODE, cpu_partial_node); STAT_ATTR(CPU_PARTIAL_DRAIN, cpu_partial_drain); +STAT_ATTR(SHEAF_FLUSH, sheaf_flush); +STAT_ATTR(SHEAF_REFILL, sheaf_refill); +STAT_ATTR(SHEAF_ALLOC, sheaf_alloc); +STAT_ATTR(SHEAF_FREE, sheaf_free); +STAT_ATTR(BARN_GET, barn_get); +STAT_ATTR(BARN_GET_FAIL, barn_get_fail); +STAT_ATTR(BARN_PUT, barn_put); +STAT_ATTR(BARN_PUT_FAIL, barn_put_fail); #endif /* CONFIG_SLUB_STATS */ =20 #ifdef CONFIG_KFENCE @@ -7311,6 +8348,7 @@ static struct attribute *slab_attrs[] =3D { &object_size_attr.attr, &objs_per_slab_attr.attr, &order_attr.attr, + &sheaf_capacity_attr.attr, &min_partial_attr.attr, &cpu_partial_attr.attr, &objects_partial_attr.attr, @@ -7342,8 +8380,10 @@ static struct attribute *slab_attrs[] =3D { &remote_node_defrag_ratio_attr.attr, #endif #ifdef CONFIG_SLUB_STATS + &alloc_cpu_sheaf_attr.attr, &alloc_fastpath_attr.attr, &alloc_slowpath_attr.attr, + &free_cpu_sheaf_attr.attr, &free_fastpath_attr.attr, &free_slowpath_attr.attr, &free_frozen_attr.attr, @@ -7368,6 +8408,14 @@ static struct attribute *slab_attrs[] =3D { &cpu_partial_free_attr.attr, &cpu_partial_node_attr.attr, &cpu_partial_drain_attr.attr, + &sheaf_flush_attr.attr, + &sheaf_refill_attr.attr, + &sheaf_alloc_attr.attr, + &sheaf_free_attr.attr, + &barn_get_attr.attr, + &barn_get_fail_attr.attr, + &barn_put_attr.attr, + &barn_put_fail_attr.attr, #endif #ifdef CONFIG_FAILSLAB &failslab_attr.attr, --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CC54B302CDF for ; 
Wed, 3 Sep 2025 13:01:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904463; cv=none; b=KVgTgBQ37wbinGfer1FJ2JhAu4b/5W1bndBUy2CcAobdW2FZAQi+ij6narGDGbBHoYaWdUv+29QE/ptxVgklfqFmK3czen+vKe8fylth++g6YyFWCUOd4gzaNFIMeYJ0K0RcwzryJT1APwjRy0P6oPU+jZ9wi67Yr0QSqAXTvtg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904463; c=relaxed/simple; bh=Qu3DbhBy7EGA38ZWL9IrWLSmnmbsvsArSbBndE1ffPU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=OcPbnmyfY2NOo31nFkK5AvneZnF5kggWOqPuqR2Z9/2xYLKf4O8uTn0z966YmnTKP4kKiz9we9m1UwmCIGR0/8mwQGG0VxT1nY+ALsrrPU797EpEukAFpiGhNnQTakOx+CD8j8sERVKYI6BYcSJ8c1QUc1miF6LZvwYe7zhDusc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=eZN9GeJP; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=iAV3Bxo3; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=A1v4I9i6; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=YMDwB/Kt; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="eZN9GeJP"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="iAV3Bxo3"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="A1v4I9i6"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="YMDwB/Kt" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with 
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 7B3DE21225; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WHNZEYi9bvdu6nDqIqDNg2B6UR4wge7RdUooBFAPhWo=; b=eZN9GeJPhRRPdRpmi1n0tbVJR0iTRmAv6QHL6zBSDNYz4ruAgOCIpoZpeedgRnVUU0bINv mZPE41l2pNjwDPO47u4pMtvc7+RzPHFIqWQPwJDvmnqX0u+4ar4UUvoqXhBZ9hJ+h65hfy YhQxuU1nE//R15R+uyDzDoWnu+f5CJA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WHNZEYi9bvdu6nDqIqDNg2B6UR4wge7RdUooBFAPhWo=; b=iAV3Bxo3pliTFUWFDM5jZea+ATJqVIzU9QilYsofcOjt5WoahEt/zEMMOvMF79o5eCZbz+ KqOsJ04LQkRMqPAA== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=A1v4I9i6; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="YMDwB/Kt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WHNZEYi9bvdu6nDqIqDNg2B6UR4wge7RdUooBFAPhWo=; b=A1v4I9i6ducf211y79IWDB7D1LUKJv0UisWd/0vIZJFIJSgnwIb2MfInaLoTEGKISK0I/m v1PT73NixhMPxDwR7WHi6JjthxYksM7eXpUc7WPPf/3YpooIdQ8Fyc0epX2e4fnmvey/Yj VN2e9uK9rk8XV6n5QYIhkQ7UkJo/zog= DKIM-Signature: v=1; 
a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WHNZEYi9bvdu6nDqIqDNg2B6UR4wge7RdUooBFAPhWo=; b=YMDwB/KtKJx5/4nORybCyIAoTVQXoy8DVDKWH7FVnqZOYJCJ3A6gkJ4eI1/kmy21uab90w ZSQptfzrqiVA2uCQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 61D8113AF8; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id uAObF9k7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:46 +0200 Subject: [PATCH v7 04/21] slab: add sheaf support for batching kfree_rcu() operations Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-4-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_RATELIMITED(0.00)[rspamd.com]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; DBL_BLOCKED_OPENRESOLVER(0.00)[oracle.com:email,suse.cz:mid,suse.cz:dkim,suse.cz:email,imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns] X-Spam-Flag: NO X-Spam-Level: X-Rspamd-Queue-Id: 7B3DE21225 X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -4.51 Extend the sheaf infrastructure for more efficient kfree_rcu() handling. For caches with sheaves, on each cpu maintain a rcu_free sheaf in addition to main and spare sheaves. kfree_rcu() operations will try to put objects on this sheaf. 
Once full, the sheaf is detached and submitted to call_rcu() with a handler that will try to put it in the barn, or flush it to slab pages using bulk free when the barn is full. Then a new empty sheaf must be obtained to put more objects there.

It's possible that no free sheaves are available to use for a new rcu_free sheaf, and the allocation in kfree_rcu() context can only use GFP_NOWAIT and thus may fail. In that case, fall back to the existing kfree_rcu() implementation.

Expected advantages:
- batching the kfree_rcu() operations, which could eventually replace the existing batching
- sheaves can be reused for allocations via the barn instead of being flushed to slabs, which is more efficient
  - this includes cases where only some cpus are allowed to process rcu callbacks (Android)

Possible disadvantage:
- objects might be waiting for more than their grace period (it is determined by the last object freed into the sheaf), increasing memory usage, but the existing batching does that too.

Only implement this for CONFIG_KVFREE_RCU_BATCHED as the tiny implementation favors a smaller memory footprint over performance.

Add CONFIG_SLUB_STATS counters free_rcu_sheaf and free_rcu_sheaf_fail to count how many kfree_rcu() calls used the rcu_free sheaf successfully and how many had to fall back to the existing implementation.
Reviewed-by: Harry Yoo Reviewed-by: Suren Baghdasaryan Signed-off-by: Vlastimil Babka Reported-by: Uladzislau Rezki --- mm/slab.h | 2 + mm/slab_common.c | 24 +++++++ mm/slub.c | 192 +++++++++++++++++++++++++++++++++++++++++++++++++++= +++- 3 files changed, 216 insertions(+), 2 deletions(-) diff --git a/mm/slab.h b/mm/slab.h index 206987ce44a4d053ebe3b5e50784d2dd23822cd1..f1866f2d9b211bb0d7f24644b80= ef4b50a7c3d24 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -435,6 +435,8 @@ static inline bool is_kmalloc_normal(struct kmem_cache = *s) return !(s->flags & (SLAB_CACHE_DMA|SLAB_ACCOUNT|SLAB_RECLAIM_ACCOUNT)); } =20 +bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj); + #define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \ SLAB_CACHE_DMA32 | SLAB_PANIC | \ SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS | \ diff --git a/mm/slab_common.c b/mm/slab_common.c index e2b197e47866c30acdbd1fee4159f262a751c5a7..2d806e02568532a1000fd3912db= 6978e945dcfa8 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1608,6 +1608,27 @@ static void kfree_rcu_work(struct work_struct *work) kvfree_rcu_list(head); } =20 +static bool kfree_rcu_sheaf(void *obj) +{ + struct kmem_cache *s; + struct folio *folio; + struct slab *slab; + + if (is_vmalloc_addr(obj)) + return false; + + folio =3D virt_to_folio(obj); + if (unlikely(!folio_test_slab(folio))) + return false; + + slab =3D folio_slab(folio); + s =3D slab->slab_cache; + if (s->cpu_sheaves) + return __kfree_rcu_sheaf(s, obj); + + return false; +} + static bool need_offload_krc(struct kfree_rcu_cpu *krcp) { @@ -1952,6 +1973,9 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr) if (!head) might_sleep(); =20 + if (kfree_rcu_sheaf(ptr)) + return; + // Queue the object but don't yet schedule the batch. if (debug_rcu_head_queue(ptr)) { // Probable double kfree_rcu(), just leak. 
diff --git a/mm/slub.c b/mm/slub.c index 42cb5848f1cecb17174967ff8b102b20a50110e3..6a64478befdebdb44cd7896d673= bd20a7a6e2889 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -367,6 +367,8 @@ enum stat_item { ALLOC_FASTPATH, /* Allocation from cpu slab */ ALLOC_SLOWPATH, /* Allocation by getting a new cpu slab */ FREE_PCS, /* Free to percpu sheaf */ + FREE_RCU_SHEAF, /* Free to rcu_free sheaf */ + FREE_RCU_SHEAF_FAIL, /* Failed to free to a rcu_free sheaf */ FREE_FASTPATH, /* Free to cpu slab */ FREE_SLOWPATH, /* Freeing not to cpu slab */ FREE_FROZEN, /* Freeing to frozen slab */ @@ -461,6 +463,7 @@ struct slab_sheaf { struct rcu_head rcu_head; struct list_head barn_list; }; + struct kmem_cache *cache; unsigned int size; void *objects[]; }; @@ -469,6 +472,7 @@ struct slub_percpu_sheaves { local_trylock_t lock; struct slab_sheaf *main; /* never NULL when unlocked */ struct slab_sheaf *spare; /* empty or full, may be NULL */ + struct slab_sheaf *rcu_free; /* for batching kfree_rcu() */ }; =20 /* @@ -2531,6 +2535,8 @@ static struct slab_sheaf *alloc_empty_sheaf(struct km= em_cache *s, gfp_t gfp) if (unlikely(!sheaf)) return NULL; =20 + sheaf->cache =3D s; + stat(s, SHEAF_ALLOC); =20 return sheaf; @@ -2655,6 +2661,43 @@ static void sheaf_flush_unused(struct kmem_cache *s,= struct slab_sheaf *sheaf) sheaf->size =3D 0; } =20 +static void __rcu_free_sheaf_prepare(struct kmem_cache *s, + struct slab_sheaf *sheaf) +{ + bool init =3D slab_want_init_on_free(s); + void **p =3D &sheaf->objects[0]; + unsigned int i =3D 0; + + while (i < sheaf->size) { + struct slab *slab =3D virt_to_slab(p[i]); + + memcg_slab_free_hook(s, slab, p + i, 1); + alloc_tagging_slab_free_hook(s, slab, p + i, 1); + + if (unlikely(!slab_free_hook(s, p[i], init, true))) { + p[i] =3D p[--sheaf->size]; + continue; + } + + i++; + } +} + +static void rcu_free_sheaf_nobarn(struct rcu_head *head) +{ + struct slab_sheaf *sheaf; + struct kmem_cache *s; + + sheaf =3D container_of(head, struct slab_sheaf, rcu_head); + s 
=3D sheaf->cache; + + __rcu_free_sheaf_prepare(s, sheaf); + + sheaf_flush_unused(s, sheaf); + + free_empty_sheaf(s, sheaf); +} + /* * Caller needs to make sure migration is disabled in order to fully flush * single cpu's sheaves @@ -2667,7 +2710,7 @@ static void sheaf_flush_unused(struct kmem_cache *s, = struct slab_sheaf *sheaf) static void pcs_flush_all(struct kmem_cache *s) { struct slub_percpu_sheaves *pcs; - struct slab_sheaf *spare; + struct slab_sheaf *spare, *rcu_free; =20 local_lock(&s->cpu_sheaves->lock); pcs =3D this_cpu_ptr(s->cpu_sheaves); @@ -2675,6 +2718,9 @@ static void pcs_flush_all(struct kmem_cache *s) spare =3D pcs->spare; pcs->spare =3D NULL; =20 + rcu_free =3D pcs->rcu_free; + pcs->rcu_free =3D NULL; + local_unlock(&s->cpu_sheaves->lock); =20 if (spare) { @@ -2682,6 +2728,9 @@ static void pcs_flush_all(struct kmem_cache *s) free_empty_sheaf(s, spare); } =20 + if (rcu_free) + call_rcu(&rcu_free->rcu_head, rcu_free_sheaf_nobarn); + sheaf_flush_main(s); } =20 @@ -2698,6 +2747,11 @@ static void __pcs_flush_all_cpu(struct kmem_cache *s= , unsigned int cpu) free_empty_sheaf(s, pcs->spare); pcs->spare =3D NULL; } + + if (pcs->rcu_free) { + call_rcu(&pcs->rcu_free->rcu_head, rcu_free_sheaf_nobarn); + pcs->rcu_free =3D NULL; + } } =20 static void pcs_destroy(struct kmem_cache *s) @@ -2723,6 +2777,7 @@ static void pcs_destroy(struct kmem_cache *s) */ =20 WARN_ON(pcs->spare); + WARN_ON(pcs->rcu_free); =20 if (!WARN_ON(pcs->main->size)) { free_empty_sheaf(s, pcs->main); @@ -3780,7 +3835,7 @@ static bool has_pcs_used(int cpu, struct kmem_cache *= s) =20 pcs =3D per_cpu_ptr(s->cpu_sheaves, cpu); =20 - return (pcs->spare || pcs->main->size); + return (pcs->spare || pcs->rcu_free || pcs->main->size); } =20 static void pcs_flush_all(struct kmem_cache *s); @@ -5415,6 +5470,130 @@ bool free_to_pcs(struct kmem_cache *s, void *object) return true; } =20 +static void rcu_free_sheaf(struct rcu_head *head) +{ + struct slab_sheaf *sheaf; + struct node_barn *barn; + 
struct kmem_cache *s; + + sheaf =3D container_of(head, struct slab_sheaf, rcu_head); + + s =3D sheaf->cache; + + /* + * This may remove some objects due to slab_free_hook() returning false, + * so that the sheaf might no longer be completely full. But it's easier + * to handle it as full (unless it became completely empty), as the code + * handles it fine. The only downside is that sheaf will serve fewer + * allocations when reused. It only happens due to debugging, which is a + * performance hit anyway. + */ + __rcu_free_sheaf_prepare(s, sheaf); + + barn =3D get_node(s, numa_mem_id())->barn; + + /* due to slab_free_hook() */ + if (unlikely(sheaf->size =3D=3D 0)) + goto empty; + + /* + * Checking nr_full/nr_empty outside lock avoids contention in case the + * barn is at the respective limit. Due to the race we might go over the + * limit but that should be rare and harmless. + */ + + if (data_race(barn->nr_full) < MAX_FULL_SHEAVES) { + stat(s, BARN_PUT); + barn_put_full_sheaf(barn, sheaf); + return; + } + + stat(s, BARN_PUT_FAIL); + sheaf_flush_unused(s, sheaf); + +empty: + if (data_race(barn->nr_empty) < MAX_EMPTY_SHEAVES) { + barn_put_empty_sheaf(barn, sheaf); + return; + } + + free_empty_sheaf(s, sheaf); +} + +bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj) +{ + struct slub_percpu_sheaves *pcs; + struct slab_sheaf *rcu_sheaf; + + if (!local_trylock(&s->cpu_sheaves->lock)) + goto fail; + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (unlikely(!pcs->rcu_free)) { + + struct slab_sheaf *empty; + struct node_barn *barn; + + if (pcs->spare && pcs->spare->size =3D=3D 0) { + pcs->rcu_free =3D pcs->spare; + pcs->spare =3D NULL; + goto do_free; + } + + barn =3D get_barn(s); + + empty =3D barn_get_empty_sheaf(barn); + + if (empty) { + pcs->rcu_free =3D empty; + goto do_free; + } + + local_unlock(&s->cpu_sheaves->lock); + + empty =3D alloc_empty_sheaf(s, GFP_NOWAIT); + + if (!empty) + goto fail; + + if (!local_trylock(&s->cpu_sheaves->lock)) { + 
barn_put_empty_sheaf(barn, empty); + goto fail; + } + + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (unlikely(pcs->rcu_free)) + barn_put_empty_sheaf(barn, empty); + else + pcs->rcu_free =3D empty; + } + +do_free: + + rcu_sheaf =3D pcs->rcu_free; + + rcu_sheaf->objects[rcu_sheaf->size++] =3D obj; + + if (likely(rcu_sheaf->size < s->sheaf_capacity)) + rcu_sheaf =3D NULL; + else + pcs->rcu_free =3D NULL; + + local_unlock(&s->cpu_sheaves->lock); + + if (rcu_sheaf) + call_rcu(&rcu_sheaf->rcu_head, rcu_free_sheaf); + + stat(s, FREE_RCU_SHEAF); + return true; + +fail: + stat(s, FREE_RCU_SHEAF_FAIL); + return false; +} + /* * Bulk free objects to the percpu sheaves. * Unlike free_to_pcs() this includes the calls to all necessary hooks @@ -6911,6 +7090,11 @@ int __kmem_cache_shutdown(struct kmem_cache *s) struct kmem_cache_node *n; =20 flush_all_cpus_locked(s); + + /* we might have rcu sheaves in flight */ + if (s->cpu_sheaves) + rcu_barrier(); + /* Attempt to free all objects */ for_each_kmem_cache_node(s, node, n) { if (n->barn) @@ -8286,6 +8470,8 @@ STAT_ATTR(ALLOC_PCS, alloc_cpu_sheaf); STAT_ATTR(ALLOC_FASTPATH, alloc_fastpath); STAT_ATTR(ALLOC_SLOWPATH, alloc_slowpath); STAT_ATTR(FREE_PCS, free_cpu_sheaf); +STAT_ATTR(FREE_RCU_SHEAF, free_rcu_sheaf); +STAT_ATTR(FREE_RCU_SHEAF_FAIL, free_rcu_sheaf_fail); STAT_ATTR(FREE_FASTPATH, free_fastpath); STAT_ATTR(FREE_SLOWPATH, free_slowpath); STAT_ATTR(FREE_FROZEN, free_frozen); @@ -8384,6 +8570,8 @@ static struct attribute *slab_attrs[] =3D { &alloc_fastpath_attr.attr, &alloc_slowpath_attr.attr, &free_cpu_sheaf_attr.attr, + &free_rcu_sheaf_attr.attr, + &free_rcu_sheaf_fail_attr.attr, &free_fastpath_attr.attr, &free_slowpath_attr.attr, &free_frozen_attr.attr, --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
ESMTPS id 96DA3302774 for ; Wed, 3 Sep 2025 13:00:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904454; cv=none; b=US4b7399CE+qVNtiDceKFrTcDE6i7g8V/ry5Yq7Hxc6yloka5Jn8ifTFvuQNuj5lKOSTkO+wo/rYNRJ9svO2V/CFeFmfW8+PwZ3fsou+uUvhlQCVJMmbJpXl+ESVFLe4gU5CzNS1w6hg49jx/xaocRFZZK/x4MJuO5vgZOYlW9c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904454; c=relaxed/simple; bh=1NGtCOtJS6H3niSiHhtmHxjJWn3v2mTVVENb68msBf0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=mhPSsbsrFtIvDJUnRz126mHYIyFByJFDdSkujD+N7+8hQk+Dx7pEMVROQgH+TyecOc3B3NiL5xYlmHHRK7xzTzXDuBQeKJ7a+T6RjwQQoujodP8Q2QZz35uHk73kSjIrMKCKZuH16G9hzwiS6GKg8aweycsM8zl9oOqc6fpfGAw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=xsnu2+dp; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Az+ZwlrV; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=c1PzccVX; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=MBmXp8wV; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="xsnu2+dp"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Az+ZwlrV"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="c1PzccVX"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="MBmXp8wV" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 9159D21226; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XowEDRPcnu9zBkZZk2GOq1SOnuNsIB8xPcy0i2fV6rg=; b=xsnu2+dp82jv3vRmfK3EgMODxDyVHw/SeNURqn0EwKfiTgtjZsEyCxswt5L3Fw9u/wKdEL f9oUtSibOCraEwsItY9ly2RIuC3Y3b9coYmu10IHY2YhTXS7GfLD8STh76AxbE2Ep24ihy BzJEvsFX9cWxLyld5WyZc3atx3IWO/U= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XowEDRPcnu9zBkZZk2GOq1SOnuNsIB8xPcy0i2fV6rg=; b=Az+ZwlrVxO4ZHUhtnaSLNE2C4Iu2FtabUDXFMP4Y1jKVkf1HU1oeaafF2n2+T66mVhkmov TIXYdbTI1uVLxPAA== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904409; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XowEDRPcnu9zBkZZk2GOq1SOnuNsIB8xPcy0i2fV6rg=; b=c1PzccVXJST0/odDaj4h/cm7TRwcsK8Xm8W0hNwEPRUBh30qNpzG4GQQqbS/5xGIZYrxj5 aJobttbAQUvqybj80RH5yZUtD2F1NtbPD+EEKwDvUw+Pws3kzv/5yiPmyzbrlvgdkdySiZ TcVPQyHGM19O8E5S8EkUfgcQCjtCQPY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904409; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XowEDRPcnu9zBkZZk2GOq1SOnuNsIB8xPcy0i2fV6rg=; b=MBmXp8wV9fUwSuyB791gnavyQq1BPpYsvi9/ruJk8T7Qv7uJA7UCggPbXbpVaN/URORf6X eEHDaYLkLG/9JeDQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 76C6013888; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id AB+1HNk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:47 +0200 Subject: [PATCH v7 05/21] slab: sheaf prefilling for guaranteed allocations Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-5-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[oracle.com:email,suse.cz:email,suse.cz:mid,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO X-Spam-Score: -4.30 Add functions for efficient guaranteed allocations e.g. in a critical section that cannot sleep, when the exact number of allocations is not known beforehand, but an upper limit can be calculated. kmem_cache_prefill_sheaf() returns a sheaf containing at least the given number of objects. kmem_cache_alloc_from_sheaf() will allocate an object from the sheaf and is guaranteed not to fail until depleted. kmem_cache_return_sheaf() is for giving the sheaf back to the slab allocator after the critical section. This will also attempt to refill it to the cache's sheaf capacity for better efficiency of sheaves handling, but it's not strictly necessary to succeed. kmem_cache_refill_sheaf() can be used to refill a previously obtained sheaf to the requested size. If the current size is sufficient, it does nothing.
If the requested size exceeds the cache's sheaf_capacity and the sheaf's current capacity, the sheaf will be replaced with a new one, hence the indirect pointer parameter.

kmem_cache_sheaf_size() can be used to query the current size.

The implementation supports requesting sizes that exceed the cache's sheaf_capacity, but it is not efficient: such "oversize" sheaves are allocated fresh in kmem_cache_prefill_sheaf() and flushed and freed immediately by kmem_cache_return_sheaf(). kmem_cache_refill_sheaf() might be especially inefficient when replacing a sheaf with a new one of a larger capacity. It is therefore better to size the cache's sheaf_capacity accordingly to make oversize sheaves exceptional.

CONFIG_SLUB_STATS counters are added for sheaf prefill and return operations. A prefill or return is considered _fast when it is able to grab or return a percpu spare sheaf (even if the sheaf needs a refill to satisfy the request, as those should amortize over time), and _slow otherwise (when the barn or even sheaf allocation/freeing has to be involved). sheaf_prefill_oversize is provided to determine how many prefills were oversize (a counter for oversize returns is not necessary as all oversize refills result in oversize returns).

When slub_debug is enabled for a cache with sheaves, no percpu sheaves exist for it, but the prefill functionality is still provided simply by all prefilled sheaves becoming oversize. If percpu sheaves are not created for a cache because the sheaf_capacity argument was not passed on cache creation, the prefills also work through oversize sheaves, but there's a WARN_ON_ONCE() to indicate the omission.
Reviewed-by: Suren Baghdasaryan Reviewed-by: Harry Yoo Signed-off-by: Vlastimil Babka --- include/linux/slab.h | 16 ++++ mm/slub.c | 263 +++++++++++++++++++++++++++++++++++++++++++++++= ++++ 2 files changed, 279 insertions(+) diff --git a/include/linux/slab.h b/include/linux/slab.h index 49acbcdc6696fd120c402adf757b3f41660ad50a..680193356ac7a22f9df5cd9b71f= f8b81e26404ad 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -829,6 +829,22 @@ void *kmem_cache_alloc_node_noprof(struct kmem_cache *= s, gfp_t flags, int node) __assume_slab_alignment __malloc; #define kmem_cache_alloc_node(...) alloc_hooks(kmem_cache_alloc_node_nopro= f(__VA_ARGS__)) =20 +struct slab_sheaf * +kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int siz= e); + +int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf **sheafp, unsigned int size); + +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf); + +void *kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *cachep, gfp_t = gfp, + struct slab_sheaf *sheaf) __assume_slab_alignment __malloc; +#define kmem_cache_alloc_from_sheaf(...) 
\ + alloc_hooks(kmem_cache_alloc_from_sheaf_noprof(__VA_ARGS__)) + +unsigned int kmem_cache_sheaf_size(struct slab_sheaf *sheaf); + /* * These macros allow declaring a kmem_buckets * parameter alongside size,= which * can be compiled out with CONFIG_SLAB_BUCKETS=3Dn so that a large number= of call diff --git a/mm/slub.c b/mm/slub.c index 6a64478befdebdb44cd7896d673bd20a7a6e2889..c6ca9b60acd15520410ac08d252= bb09e111db6f1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -401,6 +401,11 @@ enum stat_item { BARN_GET_FAIL, /* Failed to get full sheaf from barn */ BARN_PUT, /* Put full sheaf to barn */ BARN_PUT_FAIL, /* Failed to put full sheaf to barn */ + SHEAF_PREFILL_FAST, /* Sheaf prefill grabbed the spare sheaf */ + SHEAF_PREFILL_SLOW, /* Sheaf prefill found no spare sheaf */ + SHEAF_PREFILL_OVERSIZE, /* Allocation of oversize sheaf for prefill */ + SHEAF_RETURN_FAST, /* Sheaf return reattached spare sheaf */ + SHEAF_RETURN_SLOW, /* Sheaf return could not reattach spare */ NR_SLUB_STAT_ITEMS }; =20 @@ -462,6 +467,8 @@ struct slab_sheaf { union { struct rcu_head rcu_head; struct list_head barn_list; + /* only used for prefilled sheafs */ + unsigned int capacity; }; struct kmem_cache *cache; unsigned int size; @@ -2838,6 +2845,30 @@ static void barn_put_full_sheaf(struct node_barn *ba= rn, struct slab_sheaf *sheaf spin_unlock_irqrestore(&barn->lock, flags); } =20 +static struct slab_sheaf *barn_get_full_or_empty_sheaf(struct node_barn *b= arn) +{ + struct slab_sheaf *sheaf =3D NULL; + unsigned long flags; + + spin_lock_irqsave(&barn->lock, flags); + + if (barn->nr_full) { + sheaf =3D list_first_entry(&barn->sheaves_full, struct slab_sheaf, + barn_list); + list_del(&sheaf->barn_list); + barn->nr_full--; + } else if (barn->nr_empty) { + sheaf =3D list_first_entry(&barn->sheaves_empty, + struct slab_sheaf, barn_list); + list_del(&sheaf->barn_list); + barn->nr_empty--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return sheaf; +} + /* * If a full sheaf is available, 
return it and put the supplied empty one = to * barn. We ignore the limit on empty sheaves as the number of sheaves doe= sn't @@ -4970,6 +5001,228 @@ void *kmem_cache_alloc_node_noprof(struct kmem_cach= e *s, gfp_t gfpflags, int nod } EXPORT_SYMBOL(kmem_cache_alloc_node_noprof); =20 +/* + * returns a sheaf that has at least the requested size + * when prefilling is needed, do so with given gfp flags + * + * return NULL if sheaf allocation or prefilling failed + */ +struct slab_sheaf * +kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int siz= e) +{ + struct slub_percpu_sheaves *pcs; + struct slab_sheaf *sheaf =3D NULL; + + if (unlikely(size > s->sheaf_capacity)) { + + /* + * slab_debug disables cpu sheaves intentionally so all + * prefilled sheaves become "oversize" and we give up on + * performance for the debugging. Same with SLUB_TINY. + * Creating a cache without sheaves and then requesting a + * prefilled sheaf is however not expected, so warn. + */ + WARN_ON_ONCE(s->sheaf_capacity =3D=3D 0 && + !IS_ENABLED(CONFIG_SLUB_TINY) && + !(s->flags & SLAB_DEBUG_FLAGS)); + + sheaf =3D kzalloc(struct_size(sheaf, objects, size), gfp); + if (!sheaf) + return NULL; + + stat(s, SHEAF_PREFILL_OVERSIZE); + sheaf->cache =3D s; + sheaf->capacity =3D size; + + if (!__kmem_cache_alloc_bulk(s, gfp, size, + &sheaf->objects[0])) { + kfree(sheaf); + return NULL; + } + + sheaf->size =3D size; + + return sheaf; + } + + local_lock(&s->cpu_sheaves->lock); + pcs =3D this_cpu_ptr(s->cpu_sheaves); + + if (pcs->spare) { + sheaf =3D pcs->spare; + pcs->spare =3D NULL; + stat(s, SHEAF_PREFILL_FAST); + } else { + stat(s, SHEAF_PREFILL_SLOW); + sheaf =3D barn_get_full_or_empty_sheaf(get_barn(s)); + if (sheaf && sheaf->size) + stat(s, BARN_GET); + else + stat(s, BARN_GET_FAIL); + } + + local_unlock(&s->cpu_sheaves->lock); + + + if (!sheaf) + sheaf =3D alloc_empty_sheaf(s, gfp); + + if (sheaf && sheaf->size < size) { + if (refill_sheaf(s, sheaf, gfp)) { + sheaf_flush_unused(s, 
sheaf); + free_empty_sheaf(s, sheaf); + sheaf =3D NULL; + } + } + + if (sheaf) + sheaf->capacity =3D s->sheaf_capacity; + + return sheaf; +} + +/* + * Use this to return a sheaf obtained by kmem_cache_prefill_sheaf() + * + * If the sheaf cannot simply become the percpu spare sheaf, but there's s= pace + * for a full sheaf in the barn, we try to refill the sheaf back to the ca= che's + * sheaf_capacity to avoid handling partially full sheaves. + * + * If the refill fails because gfp is e.g. GFP_NOWAIT, or the barn is full= , the + * sheaf is instead flushed and freed. + */ +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf) +{ + struct slub_percpu_sheaves *pcs; + struct node_barn *barn; + + if (unlikely(sheaf->capacity !=3D s->sheaf_capacity)) { + sheaf_flush_unused(s, sheaf); + kfree(sheaf); + return; + } + + local_lock(&s->cpu_sheaves->lock); + pcs =3D this_cpu_ptr(s->cpu_sheaves); + barn =3D get_barn(s); + + if (!pcs->spare) { + pcs->spare =3D sheaf; + sheaf =3D NULL; + stat(s, SHEAF_RETURN_FAST); + } + + local_unlock(&s->cpu_sheaves->lock); + + if (!sheaf) + return; + + stat(s, SHEAF_RETURN_SLOW); + + /* + * If the barn has too many full sheaves or we fail to refill the sheaf, + * simply flush and free it. + */ + if (data_race(barn->nr_full) >=3D MAX_FULL_SHEAVES || + refill_sheaf(s, sheaf, gfp)) { + sheaf_flush_unused(s, sheaf); + free_empty_sheaf(s, sheaf); + return; + } + + barn_put_full_sheaf(barn, sheaf); + stat(s, BARN_PUT); +} + +/* + * refill a sheaf previously returned by kmem_cache_prefill_sheaf to at le= ast + * the given size + * + * the sheaf might be replaced by a new one when requesting more than + * s->sheaf_capacity objects if such replacement is necessary, but the ref= ill + * fails (returning -ENOMEM), the existing sheaf is left intact + * + * In practice we always refill to full sheaf's capacity. 
+ */ +int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf **sheafp, unsigned int size) +{ + struct slab_sheaf *sheaf; + + /* + * TODO: do we want to support *sheaf =3D=3D NULL to be equivalent of + * kmem_cache_prefill_sheaf() ? + */ + if (!sheafp || !(*sheafp)) + return -EINVAL; + + sheaf =3D *sheafp; + if (sheaf->size >=3D size) + return 0; + + if (likely(sheaf->capacity >=3D size)) { + if (likely(sheaf->capacity =3D=3D s->sheaf_capacity)) + return refill_sheaf(s, sheaf, gfp); + + if (!__kmem_cache_alloc_bulk(s, gfp, sheaf->capacity - sheaf->size, + &sheaf->objects[sheaf->size])) { + return -ENOMEM; + } + sheaf->size =3D sheaf->capacity; + + return 0; + } + + /* + * We had a regular sized sheaf and need an oversize one, or we had an + * oversize one already but need a larger one now. + * This should be a very rare path so let's not complicate it. + */ + sheaf =3D kmem_cache_prefill_sheaf(s, gfp, size); + if (!sheaf) + return -ENOMEM; + + kmem_cache_return_sheaf(s, gfp, *sheafp); + *sheafp =3D sheaf; + return 0; +} + +/* + * Allocate from a sheaf obtained by kmem_cache_prefill_sheaf() + * + * Guaranteed not to fail as many allocations as was the requested size. + * After the sheaf is emptied, it fails - no fallback to the slab cache it= self. + * + * The gfp parameter is meant only to specify __GFP_ZERO or __GFP_ACCOUNT + * memcg charging is forced over limit if necessary, to avoid failure. 
+ */ +void * +kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf) +{ + void *ret =3D NULL; + bool init; + + if (sheaf->size =3D=3D 0) + goto out; + + ret =3D sheaf->objects[--sheaf->size]; + + init =3D slab_want_init_on_alloc(gfp, s); + + /* add __GFP_NOFAIL to force successful memcg charging */ + slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, init, s->objec= t_size); +out: + trace_kmem_cache_alloc(_RET_IP_, ret, s, gfp, NUMA_NO_NODE); + + return ret; +} + +unsigned int kmem_cache_sheaf_size(struct slab_sheaf *sheaf) +{ + return sheaf->size; +} /* * To avoid unnecessary overhead, we pass through large allocation requests * directly to the page allocator. We use __GFP_COMP, because we will need= to @@ -8504,6 +8757,11 @@ STAT_ATTR(BARN_GET, barn_get); STAT_ATTR(BARN_GET_FAIL, barn_get_fail); STAT_ATTR(BARN_PUT, barn_put); STAT_ATTR(BARN_PUT_FAIL, barn_put_fail); +STAT_ATTR(SHEAF_PREFILL_FAST, sheaf_prefill_fast); +STAT_ATTR(SHEAF_PREFILL_SLOW, sheaf_prefill_slow); +STAT_ATTR(SHEAF_PREFILL_OVERSIZE, sheaf_prefill_oversize); +STAT_ATTR(SHEAF_RETURN_FAST, sheaf_return_fast); +STAT_ATTR(SHEAF_RETURN_SLOW, sheaf_return_slow); #endif /* CONFIG_SLUB_STATS */ =20 #ifdef CONFIG_KFENCE @@ -8604,6 +8862,11 @@ static struct attribute *slab_attrs[] =3D { &barn_get_fail_attr.attr, &barn_put_attr.attr, &barn_put_fail_attr.attr, + &sheaf_prefill_fast_attr.attr, + &sheaf_prefill_slow_attr.attr, + &sheaf_prefill_oversize_attr.attr, + &sheaf_return_fast_attr.attr, + &sheaf_return_slow_attr.attr, #endif #ifdef CONFIG_FAILSLAB &failslab_attr.attr, --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8539A301027 for ; Wed, 3 Sep 2025 13:00:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904445; cv=none; b=qjh53EcAlQvuS0GhXgbS1/iBXLZMB6z+Q2AFenXZWmmP1n/WIiq8vwKZfSaovfmzQgI/BhuwjPSRObHSMwrORuMjg/DBPJ/Pys8x1+fvPDzFqT6rJFg+bPbWmgB+qBubAXi/uD+jjg5QFVYw58kmP9FtlBDYnystbxJgJJueY08= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904445; c=relaxed/simple; bh=7CCSDJsDSo53rJ2Xve2xGVYoT/NsP8GXnfH4mSKfrok=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ufZzENv0DM6SV0Non8HtZOxx2CU3aVyQCYIgjSCAub8V0C+wrvuVz3i3duu/dok8uwvWtn/UAnU5Ra1R5aIPDjxR5PMAKgLVAnEamXaYlHw7xhtKVPxF70NH07FvObhJv0mYzLOufvPODqjHWIw3PC+ZBV2iNqUTRkrB8ULXAD0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=LoETShIM; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=MyQghC5D; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=LoETShIM; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=MyQghC5D; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="LoETShIM"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="MyQghC5D"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="LoETShIM"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="MyQghC5D" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 
bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 4F49F21227; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GvueOpBRqUSJBVmPF3IuGLVWneYsx8iJYPivjgPVl+E=; b=LoETShIMk0pCx5L816uJyvFy7c4G6hgcTgDNeP0wdw3yv0H1gKUdk/cqAYnCOz9H7IfMBQ HDr9bFcRAknhxCqHENhCd3m3YzDFv71MZmJZCdVgC1HKbU0Y75RhdAXAhziIlsleurg51T VNyNCxItm6H6usKuYzzlrfROk7g2Vzw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GvueOpBRqUSJBVmPF3IuGLVWneYsx8iJYPivjgPVl+E=; b=MyQghC5DsjfSLaNXl3OCffrJh1mZIzymp5cPWwwkaVoofGSGp5XZAZ+nQGdVjDXWQmPKlF PfA/Q+4MQ1QPxkDQ== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=LoETShIM; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=MyQghC5D
Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 8B0FB13AFB; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id YPyuIdk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:48 +0200 Subject: [PATCH v7 06/21] slab: determine barn status racily outside of lock Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-6-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R.
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spam-Level: X-Spam-Flag: NO X-Rspamd-Queue-Id: 4F49F21227 X-Rspamd-Action: no action X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_RATELIMITED(0.00)[rspamd.com]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; DBL_BLOCKED_OPENRESOLVER(0.00)[oracle.com:email,suse.cz:dkim,suse.cz:mid,suse.cz:email,imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo] X-Spam-Score: -4.51 The possibility of many barn operations is determined by the current number of full or empty sheaves. Taking the barn->lock just to find out that e.g. there are no empty sheaves results in unnecessary overhead and lock contention. Thus perform these checks outside of the lock with a data_race() annotated variable read and fail quickly without taking the lock. 
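
As an illustration only, the pattern can be sketched in userspace C. This is not the kernel code: `struct toy_barn`, `toy_barn_get_empty()` and the `lock_acquisitions` counter are hypothetical names, a C11 `atomic_flag` spinlock stands in for barn->lock, and a relaxed atomic load stands in for the data_race() annotated read. A peek that succeeds is checked again once the lock is held, because the counter may have changed in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct node_barn. */
struct toy_barn {
	atomic_flag lock;               /* plays the role of barn->lock */
	atomic_uint nr_empty;           /* modified only with the lock held */
	unsigned int lock_acquisitions; /* instrumentation for this sketch */
};

static void toy_barn_init(struct toy_barn *barn, unsigned int nr_empty)
{
	assert(barn != NULL);
	atomic_flag_clear(&barn->lock);
	atomic_init(&barn->nr_empty, nr_empty);
	barn->lock_acquisitions = 0;
}

static void toy_barn_lock(struct toy_barn *barn)
{
	while (atomic_flag_test_and_set_explicit(&barn->lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
	barn->lock_acquisitions++;
}

static void toy_barn_unlock(struct toy_barn *barn)
{
	atomic_flag_clear_explicit(&barn->lock, memory_order_release);
}

/* Returns 1 if an empty sheaf was taken from the barn, 0 otherwise. */
static int toy_barn_get_empty(struct toy_barn *barn)
{
	int got = 0;

	/* Racy peek outside the lock: fail fast when nothing is there. */
	if (!atomic_load_explicit(&barn->nr_empty, memory_order_relaxed))
		return 0;

	toy_barn_lock(barn);
	/* The peek may be stale, so the check is repeated under the lock. */
	if (atomic_load_explicit(&barn->nr_empty, memory_order_relaxed)) {
		atomic_fetch_sub_explicit(&barn->nr_empty, 1,
					  memory_order_relaxed);
		got = 1;
	}
	toy_barn_unlock(barn);

	return got;
}
```

With one empty sheaf in the toy barn, the first call takes the lock and succeeds, while a second call returns early without touching the lock at all.
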
Checks for sheaf availability that racily succeed obviously have to be repeated under the lock for correctness, but we can skip repeating checks if there are too many sheaves on the given list as the limits don't need to be strict. Reviewed-by: Suren Baghdasaryan Reviewed-by: Harry Yoo Signed-off-by: Vlastimil Babka --- mm/slub.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index c6ca9b60acd15520410ac08d252bb09e111db6f1..d078630f7211b617f390a16fa08= 0da8a1df45355 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2801,9 +2801,12 @@ static struct slab_sheaf *barn_get_empty_sheaf(struc= t node_barn *barn) struct slab_sheaf *empty =3D NULL; unsigned long flags; =20 + if (!data_race(barn->nr_empty)) + return NULL; + spin_lock_irqsave(&barn->lock, flags); =20 - if (barn->nr_empty) { + if (likely(barn->nr_empty)) { empty =3D list_first_entry(&barn->sheaves_empty, struct slab_sheaf, barn_list); list_del(&empty->barn_list); @@ -2850,6 +2853,9 @@ static struct slab_sheaf *barn_get_full_or_empty_shea= f(struct node_barn *barn) struct slab_sheaf *sheaf =3D NULL; unsigned long flags; =20 + if (!data_race(barn->nr_full) && !data_race(barn->nr_empty)) + return NULL; + spin_lock_irqsave(&barn->lock, flags); =20 if (barn->nr_full) { @@ -2880,9 +2886,12 @@ barn_replace_empty_sheaf(struct node_barn *barn, str= uct slab_sheaf *empty) struct slab_sheaf *full =3D NULL; unsigned long flags; =20 + if (!data_race(barn->nr_full)) + return NULL; + spin_lock_irqsave(&barn->lock, flags); =20 - if (barn->nr_full) { + if (likely(barn->nr_full)) { full =3D list_first_entry(&barn->sheaves_full, struct slab_sheaf, barn_list); list_del(&full->barn_list); @@ -2906,19 +2915,23 @@ barn_replace_full_sheaf(struct node_barn *barn, str= uct slab_sheaf *full) struct slab_sheaf *empty; unsigned long flags; =20 + /* we don't repeat this check under barn->lock as it's not critical */ + if (data_race(barn->nr_full) >=3D MAX_FULL_SHEAVES) + return
ERR_PTR(-E2BIG); + if (!data_race(barn->nr_empty)) + return ERR_PTR(-ENOMEM); + spin_lock_irqsave(&barn->lock, flags); =20 - if (barn->nr_full >=3D MAX_FULL_SHEAVES) { - empty =3D ERR_PTR(-E2BIG); - } else if (!barn->nr_empty) { - empty =3D ERR_PTR(-ENOMEM); - } else { + if (likely(barn->nr_empty)) { empty =3D list_first_entry(&barn->sheaves_empty, struct slab_sheaf, barn_list); list_del(&empty->barn_list); list_add(&full->barn_list, &barn->sheaves_full); barn->nr_empty--; barn->nr_full++; + } else { + empty =3D ERR_PTR(-ENOMEM); } =20 spin_unlock_irqrestore(&barn->lock, flags); --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A3A543009EF for ; Wed, 3 Sep 2025 13:00:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904433; cv=none; b=pJjT5AvcQhLRHLRk9gMKDjkJ1EaYRwNuo0Aa16AaDo5HGQEeyVJd64waLiBh1l+RtKJhFg8KQsgEVuEN+vO9iebdfsdHYXmC7kUQTo5vpYGt0cQ9NDbnBtXbaycZ8RyVkx1QsNc3JHRq3aMdR5zOJrYxyLeTAGLWKOpQeJUPcR0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904433; c=relaxed/simple; bh=9P3w37uvK4hCtxYB1SWfj/3zCCYxV5YoKmWdRJi0fMI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Sm4KI9mXRdOyQIiG/24q0CDDbXWjS49Hl9t8IJNkyoHf6zm2q9kRTGOTk9LeJ+9oBVHalty1qNl4mceLwhMOLoy4mcLPzcYmKqa9sES1DzmO5G6Qn9qG6PQFV7725nzdbCPkRO+dcjvpJS8vrJwT3CNaafiGA16rdKEM27hK5vU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=UHokq7Id; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz 
header.b=pmosW3Z5; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=UHokq7Id; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=pmosW3Z5; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="UHokq7Id"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="pmosW3Z5"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="UHokq7Id"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="pmosW3Z5" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 547E21F461; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wINAnmJZsrmMDedeNn5PMbqYkUg3J6y9665h1Yp2O34=; b=UHokq7Idup178YOMFRbRTiP4XrmwipxKAOlMqzVkHx/ucsWcZ0adA7aM9Qz/9+dcZZH1jH YuzBVF+h3ODeXefcBjtYIm0nCGq8WP2SZDZi5xojADd7AOCzy5UwL5WpiLsZZN0+kQ96DQ aSRw2ptS66V4PY2kY6npMzJvH+piN0g= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=wINAnmJZsrmMDedeNn5PMbqYkUg3J6y9665h1Yp2O34=; b=pmosW3Z5p3aeVp5MFCU2sc+FAPShkKRa9OZpBEOCIdBZC3IpupKPpA/m48UjM8JttsxVoB 3wqUe5+0EEHZT3AA== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=UHokq7Id; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=pmosW3Z5 Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 9EE2413AFF; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id kMuCJtk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:49 +0200 Subject: [PATCH v7 07/21] slab:
skip percpu sheaves for remote object freeing Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-7-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spam-Level: X-Spam-Flag: NO X-Rspamd-Queue-Id: 547E21F461 X-Rspamd-Action: no action X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_RATELIMITED(0.00)[rspamd.com]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; 
DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo,suse.cz:dkim,suse.cz:mid,suse.cz:email,oracle.com:email] X-Spam-Score: -4.51

Since we don't control the NUMA locality of objects in percpu sheaves, allocations with node restrictions bypass them. Allocations without restrictions may however still expect to get local objects with high probability, and the introduction of sheaves can decrease it due to freed objects from a remote node ending up in percpu sheaves.

The fraction of such remote frees seems low (5% on an 8-node machine) but it can be expected that some cache- or workload-specific corner cases exist. We can either conclude that this is not a problem due to the low fraction, or we can make remote frees bypass percpu sheaves and go directly to their slabs. This will make the remote frees more expensive, but if it's only a small fraction, most frees will still benefit from the lower overhead of percpu sheaves.

This patch thus makes remote object freeing bypass percpu sheaves, including bulk freeing, and kfree_rcu() via the rcu_free sheaf. However, it's not intended to be a 100% guarantee that percpu sheaves will only contain local objects. The refill from slabs does not provide that guarantee in the first place, and there might be cpu migrations happening when we need to unlock the local_lock. Avoiding all that could be possible but complicated, so we can leave it for later investigation whether it would be worth it. It can be expected that the more selective freeing will itself prevent accumulation of remote objects in percpu sheaves, so any such violations would have only short-term effects.
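
For illustration, the single-object decision can be modelled outside the kernel as below. The names (`toy_object`, `toy_cpu_cache`, `toy_free()`, `TOY_SHEAF_CAPACITY`) are hypothetical stand-ins rather than slab API; in the patch the comparison is between slab_nid(slab) and numa_mem_id(), and the slow path is do_slab_free(). The sketch ignores locking, the barn and cpu migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SHEAF_CAPACITY 4 /* hypothetical, not the kernel's value */

/* Hypothetical object that remembers which NUMA node backs it. */
struct toy_object {
	int nid;
};

/* Hypothetical per-CPU cache holding a single sheaf. */
struct toy_cpu_cache {
	int local_nid;     /* node of the CPU doing the free */
	size_t size;       /* objects currently in the sheaf */
	struct toy_object *objects[TOY_SHEAF_CAPACITY];
	size_t slow_frees; /* frees that bypassed the sheaf */
};

/*
 * Returns true if the object was cached in the per-CPU sheaf, false if it
 * went down the slow path (remote object, or the sheaf was full).
 */
static bool toy_free(struct toy_cpu_cache *pcs, struct toy_object *obj)
{
	assert(pcs != NULL && obj != NULL);

	/* Only node-local objects are allowed into the per-CPU sheaf. */
	if (obj->nid == pcs->local_nid && pcs->size < TOY_SHEAF_CAPACITY) {
		pcs->objects[pcs->size++] = obj;
		return true;
	}

	/* Remote (or overflowing) frees go straight back to their slab. */
	pcs->slow_frees++;
	return false;
}
```

Freeing a node-local object fills the toy sheaf, while an object from another node is only counted as a slow free and never lands in the sheaf.
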
Reviewed-by: Harry Yoo Signed-off-by: Vlastimil Babka --- mm/slab_common.c | 7 +++++-- mm/slub.c | 42 ++++++++++++++++++++++++++++++++++++------ 2 files changed, 41 insertions(+), 8 deletions(-) diff --git a/mm/slab_common.c b/mm/slab_common.c index 2d806e02568532a1000fd3912db6978e945dcfa8..08f5baee1309e5b5f10a22b8b3b= 0a09dfb314419 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1623,8 +1623,11 @@ static bool kfree_rcu_sheaf(void *obj) =20 slab =3D folio_slab(folio); s =3D slab->slab_cache; - if (s->cpu_sheaves) - return __kfree_rcu_sheaf(s, obj); + if (s->cpu_sheaves) { + if (likely(!IS_ENABLED(CONFIG_NUMA) || + slab_nid(slab) =3D=3D numa_mem_id())) + return __kfree_rcu_sheaf(s, obj); + } =20 return false; } diff --git a/mm/slub.c b/mm/slub.c index d078630f7211b617f390a16fa080da8a1df45355..8e34d6afccf41efad0c6620126e= 09f874bdcb663 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -472,6 +472,7 @@ struct slab_sheaf { }; struct kmem_cache *cache; unsigned int size; + int node; /* only used for rcu_sheaf */ void *objects[]; }; =20 @@ -5756,7 +5757,7 @@ static void rcu_free_sheaf(struct rcu_head *head) */ __rcu_free_sheaf_prepare(s, sheaf); =20 - barn =3D get_node(s, numa_mem_id())->barn; + barn =3D get_node(s, sheaf->node)->barn; =20 /* due to slab_free_hook() */ if (unlikely(sheaf->size =3D=3D 0)) @@ -5842,10 +5843,12 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *= obj) =20 rcu_sheaf->objects[rcu_sheaf->size++] =3D obj; =20 - if (likely(rcu_sheaf->size < s->sheaf_capacity)) + if (likely(rcu_sheaf->size < s->sheaf_capacity)) { rcu_sheaf =3D NULL; - else + } else { pcs->rcu_free =3D NULL; + rcu_sheaf->node =3D numa_mem_id(); + } =20 local_unlock(&s->cpu_sheaves->lock); =20 @@ -5872,7 +5875,11 @@ static void free_to_pcs_bulk(struct kmem_cache *s, s= ize_t size, void **p) bool init =3D slab_want_init_on_free(s); unsigned int batch, i =3D 0; struct node_barn *barn; + void *remote_objects[PCS_BATCH_MAX]; + unsigned int remote_nr =3D 0; + int node =3D 
numa_mem_id(); =20 +next_remote_batch: while (i < size) { struct slab *slab =3D virt_to_slab(p[i]); =20 @@ -5882,7 +5889,15 @@ static void free_to_pcs_bulk(struct kmem_cache *s, s= ize_t size, void **p) if (unlikely(!slab_free_hook(s, p[i], init, false))) { p[i] =3D p[--size]; if (!size) - return; + goto flush_remote; + continue; + } + + if (unlikely(IS_ENABLED(CONFIG_NUMA) && slab_nid(slab) !=3D node)) { + remote_objects[remote_nr] =3D p[i]; + p[i] =3D p[--size]; + if (++remote_nr >=3D PCS_BATCH_MAX) + goto flush_remote; continue; } =20 @@ -5952,6 +5967,15 @@ static void free_to_pcs_bulk(struct kmem_cache *s, s= ize_t size, void **p) */ fallback: __kmem_cache_free_bulk(s, size, p); + +flush_remote: + if (remote_nr) { + __kmem_cache_free_bulk(s, remote_nr, &remote_objects[0]); + if (i < size) { + remote_nr =3D 0; + goto next_remote_batch; + } + } } =20 #ifndef CONFIG_SLUB_TINY @@ -6043,8 +6067,14 @@ void slab_free(struct kmem_cache *s, struct slab *sl= ab, void *object, if (unlikely(!slab_free_hook(s, object, slab_want_init_on_free(s), false)= )) return; =20 - if (!s->cpu_sheaves || !free_to_pcs(s, object)) - do_slab_free(s, slab, object, object, 1, addr); + if (s->cpu_sheaves && likely(!IS_ENABLED(CONFIG_NUMA) || + slab_nid(slab) =3D=3D numa_mem_id())) { + if (likely(free_to_pcs(s, object))) { + return; + } + } + + do_slab_free(s, slab, object, object, 1, addr); } =20 #ifdef CONFIG_MEMCG --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A47343009F5 for ; Wed, 3 Sep 2025 13:00:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904434; cv=none; 
b=H7Nf+TlTdcvw0CraCsbEq47wG1465juil2ReGER43LORzAizU34z+cTsQmEw0KJmnxKgmaIfcaNUEzSkCJ6RBXLmo3xgfgIOVMg0k0Vk7pFup1ovCU1Gmq6LiXNOKSGZdyhwqeC+d+G8OUCOMgFthOVccsATaehOxxIB3psSFlQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904434; c=relaxed/simple; bh=L7pepXxJCLwzgLMMIP7/yy4noBBnWbhzfBQn6BpgavY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=iZ/mx7VRNBR3H34hhed+YzRwZ9uEvRJR3FYMcbsU55V/b+veiDz/oKQTZvgsrpxEyRgXoKYcaC7e4LDj9sFwC4J2g/tpjqEEQJYKFwko6WW6w2PpIMoE3aD97VGNuZpeU6RXEA+8cimDmwCotElmpyOig0NGaTTBxx/Iv92lXn8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=d815Ru4u; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Q4+GHUrg; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=d815Ru4u; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Q4+GHUrg; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="d815Ru4u"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Q4+GHUrg"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="d815Ru4u"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Q4+GHUrg" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 7DF6C21228; 
Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0oWCjHlWJr99JqLNVmSxndIbZ+8rS7a9vseEqgaq/bM=; b=d815Ru4uv7FMVwbMjJB4RuhsWaLaTIIsXS/Enetu1ehbGaUOZHzOEk5zOBUWdiHqESNPAi IshDv4hWYz+IXGtkMQmmUFmy32avX5EjUoim4CR+byOlCeRa1Em2yAWaM3pUCcw84d1g6J igpy1jGPjXru50OKpo2wnapy3YyjZ5w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0oWCjHlWJr99JqLNVmSxndIbZ+8rS7a9vseEqgaq/bM=; b=Q4+GHUrgEiS5Zd6Q5xYp5H5wgz+ryOTaUM26qiCw7XbrlIXvK97mQmHS7/ulCABCLSR1Bj ijrJ8InGr+ML3NCA== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=d815Ru4u; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=Q4+GHUrg DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0oWCjHlWJr99JqLNVmSxndIbZ+8rS7a9vseEqgaq/bM=; b=d815Ru4uv7FMVwbMjJB4RuhsWaLaTIIsXS/Enetu1ehbGaUOZHzOEk5zOBUWdiHqESNPAi IshDv4hWYz+IXGtkMQmmUFmy32avX5EjUoim4CR+byOlCeRa1Em2yAWaM3pUCcw84d1g6J igpy1jGPjXru50OKpo2wnapy3YyjZ5w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0oWCjHlWJr99JqLNVmSxndIbZ+8rS7a9vseEqgaq/bM=; b=Q4+GHUrgEiS5Zd6Q5xYp5H5wgz+ryOTaUM26qiCw7XbrlIXvK97mQmHS7/ulCABCLSR1Bj ijrJ8InGr+ML3NCA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id B2EDC13B01; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 4KBiK9k7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:50 +0200 Subject: [PATCH v7 08/21] slab: allow NUMA restricted allocations to use percpu sheaves Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-8-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_RATELIMITED(0.00)[rspamd.com]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns,oracle.com:email,suse.cz:mid,suse.cz:dkim,suse.cz:email] X-Spam-Flag: NO X-Spam-Level: X-Rspamd-Queue-Id: 7DF6C21228 X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -4.51 Currently, allocations that ask for a specific node, either explicitly or via mempolicy in strict_numa mode, bypass percpu sheaves. Since sheaves contain mostly local objects, we can try allocating from them if the local node happens to be the requested node or allowed by the mempolicy. 
If we find the object from percpu sheaves is not from the expected node, we skip the sheaves - this should be rare. Reviewed-by: Harry Yoo Signed-off-by: Vlastimil Babka --- mm/slub.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 46 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index 8e34d6afccf41efad0c6620126e09f874bdcb663..d3d054579a8af710a0cbf68fe96= 88565c84d3609 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4816,18 +4816,43 @@ __pcs_replace_empty_main(struct kmem_cache *s, stru= ct slub_percpu_sheaves *pcs, } =20 static __fastpath_inline -void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp) +void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, int node) { struct slub_percpu_sheaves *pcs; + bool node_requested; void *object; =20 #ifdef CONFIG_NUMA - if (static_branch_unlikely(&strict_numa)) { - if (current->mempolicy) - return NULL; + if (static_branch_unlikely(&strict_numa) && + node =3D=3D NUMA_NO_NODE) { + + struct mempolicy *mpol =3D current->mempolicy; + + if (mpol) { + /* + * Special BIND rule support. If the local node + * is in permitted set then do not redirect + * to a particular node. + * Otherwise we apply the memory policy to get + * the node we need to allocate on. + */ + if (mpol->mode !=3D MPOL_BIND || + !node_isset(numa_mem_id(), mpol->nodes)) + + node =3D mempolicy_slab_node(); + } } #endif =20 + node_requested =3D IS_ENABLED(CONFIG_NUMA) && node !=3D NUMA_NO_NODE; + + /* + * We assume the percpu sheaves contain only local objects although it's + * not completely guaranteed, so we verify later. 
+ */ + if (unlikely(node_requested && node !=3D numa_mem_id())) + return NULL; + if (!local_trylock(&s->cpu_sheaves->lock)) return NULL; =20 @@ -4839,7 +4864,21 @@ void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp) return NULL; } =20 - object =3D pcs->main->objects[--pcs->main->size]; + object =3D pcs->main->objects[pcs->main->size - 1]; + + if (unlikely(node_requested)) { + /* + * Verify that the object was from the node we want. This could + * be false because of cpu migration during an unlocked part of + * the current allocation or previous freeing process. + */ + if (folio_nid(virt_to_folio(object)) !=3D node) { + local_unlock(&s->cpu_sheaves->lock); + return NULL; + } + } + + pcs->main->size--; =20 local_unlock(&s->cpu_sheaves->lock); =20 @@ -4939,8 +4978,8 @@ static __fastpath_inline void *slab_alloc_node(struct= kmem_cache *s, struct list if (unlikely(object)) goto out; =20 - if (s->cpu_sheaves && node =3D=3D NUMA_NO_NODE) - object =3D alloc_from_pcs(s, gfpflags); + if (s->cpu_sheaves) + object =3D alloc_from_pcs(s, gfpflags, node); =20 if (!object) object =3D __slab_alloc_node(s, gfpflags, node, addr, orig_size); --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C6CD6303C9D for ; Wed, 3 Sep 2025 13:00:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904461; cv=none; b=D+ErCWWctjF+88ltAjI4jdtahxBANVutB4D8LMUoe4MYgheS1hD1p9Tz8/In7xtCuBkjXcCPXGum2MCOaVyAdZHuWx+OefQbsWSBpG9frOUNYB6G3Dol52+gWNcPQdA92MuSdSRRNhYLhLVN5ACnkEkX7M2Ht9BNgBegTsvMYWs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904461; c=relaxed/simple; bh=SNS9mPclBWDQTLp0gzNlmhMRdVlMkCqS+zQlMmESaHw=; 
h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=De7jKFgvzdcQvLTDFNL2P3ATs44j/183/2SNfMYYsb+ekhxqWWPXjHy4aP1OLn5KudCkGZZ6x34QI+irMmpGT1HJ3dCDYa0onUS4zBQ8y4KbtkN8Sw2EwP46RAPcRmxs0X7NrbQYb0tsHJfsiwk7DAzT3ZFqNv27Cb86Df0NHLk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=omytXGUa; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=/uLDg7Ai; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=omytXGUa; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=/uLDg7Ai; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="omytXGUa"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="/uLDg7Ai"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="omytXGUa"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="/uLDg7Ai" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 7FF9B1F46E; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=04FJ6mdlY7gFitpo8K6WI1BZw3Xwifvtp3g2IfjcN5g=; b=omytXGUab1APokv8IgrliU7EEnaR7rjmj689H5mwTxRcyC+zLjitLaAQcuqSM2LSvMeUSx 6IhI/bsfGSdYNop32O25YDh5nnEOUk4MKjKPcF9s9Gx1fEQ83wc/8mCjZ70h1Q5NAQFyc1 VaRLZxprUijo2uRFllSgCyocBoC6jqU= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=04FJ6mdlY7gFitpo8K6WI1BZw3Xwifvtp3g2IfjcN5g=; b=/uLDg7Aimp53izI7djRGz4dVIz4AAglWuPzXAwP4GzKBPMdo6k5EC7X96oW4P/aSDpKlqS xsDJJe7apmqrTYAQ== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=04FJ6mdlY7gFitpo8K6WI1BZw3Xwifvtp3g2IfjcN5g=; b=omytXGUab1APokv8IgrliU7EEnaR7rjmj689H5mwTxRcyC+zLjitLaAQcuqSM2LSvMeUSx 6IhI/bsfGSdYNop32O25YDh5nnEOUk4MKjKPcF9s9Gx1fEQ83wc/8mCjZ70h1Q5NAQFyc1 VaRLZxprUijo2uRFllSgCyocBoC6jqU= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=04FJ6mdlY7gFitpo8K6WI1BZw3Xwifvtp3g2IfjcN5g=; b=/uLDg7Aimp53izI7djRGz4dVIz4AAglWuPzXAwP4GzKBPMdo6k5EC7X96oW4P/aSDpKlqS xsDJJe7apmqrTYAQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by 
imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id C7F6C13B02; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 4PaJMNk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:51 +0200 Subject: [PATCH v7 09/21] tools/testing/maple_tree: Fix check_bulk_rebalance() locks Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-9-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:mid,suse.cz:email,oracle.com:email,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO 
X-Spam-Level: X-Spam-Score: -4.30 From: "Liam R. Howlett" The check_bulk_rebalance() test was not correctly locking the tree, which caused issues with the sheaves testing in later patches. Adding the missing locks fixes the issue. Fixes: a6e0ceb7bf48 ("maple_tree: check for MA_STATE_BULK on setting wr_reb= alance") Signed-off-by: Liam R. Howlett Reviewed-by: Sidhartha Kumar Signed-off-by: Vlastimil Babka --- tools/testing/radix-tree/maple.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/ma= ple.c index 172700fb7784d29f9403003b4484a5ebd7aa316b..159d5307b30a4b37e6cf2941848= b8718e1b891d9 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -36465,6 +36465,7 @@ static inline void check_bulk_rebalance(struct mapl= e_tree *mt) =20 build_full_tree(mt, 0, 2); =20 + mas_lock(&mas); /* erase every entry in the tree */ do { /* set up bulk store mode */ @@ -36474,6 +36475,7 @@ static inline void check_bulk_rebalance(struct mapl= e_tree *mt) } while (mas_prev(&mas, 0) !=3D NULL); =20 mas_destroy(&mas); + mas_unlock(&mas); } =20 void farmer_tests(void) --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B40D430101D for ; Wed, 3 Sep 2025 13:00:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904442; cv=none; b=ZWRqxwKDFpwDka3Rk7aLPeydDimDaYeeQ7TaUeaPlgodLaQQtQJnAGrmaEvk80AlJXbqjitvGssWGFFxbHR6rRlI5/OyFESkukogW0reEuTYXbI8rJNbXPJobAFioRpEz4lsWz5jLQT88gFwCQ5YKZvByCd7LFZ9B4IoZ+e+IEI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904442; c=relaxed/simple; 
bh=isKMO/oEHRqTdYv8cphz4X4oJvUf3JlTf8u3MNzeRuY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Pnr71KSvFMd97qkJoxn+fpIZu3qzHperyy+Yw/Lf+acTHRR7BagqOEN1lDFfOwzRD4UJlIKe124Z8UwLnyxtcnbHKDDIXw0TnX5ZkjbMnHu7vEI5WfSP1mRPZVCHZcOywvZhW0jmqQCIpubhjlsxt/hEnVvljFzZW2VRWAm3f5s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=xmlnP3KJ; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=N4++Wte7; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=xmlnP3KJ; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=N4++Wte7; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="xmlnP3KJ"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="N4++Wte7"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="xmlnP3KJ"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="N4++Wte7" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id AEC901F745; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Y8VowGBFfDXG/GD85vj37Bupo+CslYBKF/PIHnrFTIo=; b=xmlnP3KJzd9cpQ2KM9zzpkXW/WInltLBBIOx1WWqZstmEQ8zlGEsKFADxRv2DWWqhVy+g7 bH6hbWOVsz1cZLixljP1tJxgbpWYrX16+YFgN6UjjsXNaHGrhSrVHTYdM+v2zVQvWkGRf1 2WBXlMwwIxcQZds6glBZ4JTZmCtNwBo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Y8VowGBFfDXG/GD85vj37Bupo+CslYBKF/PIHnrFTIo=; b=N4++Wte7jeEZOXvrC75+VMaUkacqmOM3kfAzUKtfPmsg95/u4HsVUJSInm3u2ssLDRl2z/ ZvUNnf7oZJOVtVBg== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=xmlnP3KJ; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=N4++Wte7 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Y8VowGBFfDXG/GD85vj37Bupo+CslYBKF/PIHnrFTIo=; b=xmlnP3KJzd9cpQ2KM9zzpkXW/WInltLBBIOx1WWqZstmEQ8zlGEsKFADxRv2DWWqhVy+g7 bH6hbWOVsz1cZLixljP1tJxgbpWYrX16+YFgN6UjjsXNaHGrhSrVHTYdM+v2zVQvWkGRf1 2WBXlMwwIxcQZds6glBZ4JTZmCtNwBo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Y8VowGBFfDXG/GD85vj37Bupo+CslYBKF/PIHnrFTIo=; b=N4++Wte7jeEZOXvrC75+VMaUkacqmOM3kfAzUKtfPmsg95/u4HsVUJSInm3u2ssLDRl2z/ ZvUNnf7oZJOVtVBg== Received: from 
imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id DB42D13B03; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id +FNHNdk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:52 +0200 Subject: [PATCH v7 10/21] tools/testing/vma: Implement vm_refcnt reset Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-10-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spam-Level: X-Spam-Flag: NO X-Rspamd-Queue-Id: AEC901F745 X-Rspamd-Action: no action X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_RATELIMITED(0.00)[rspamd.com]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo,suse.cz:dkim,suse.cz:mid,suse.cz:email,oracle.com:email] X-Spam-Score: -4.51 From: "Liam R. Howlett" Add the reset of the ref count in vma_lock_init(). This is needed if the vma memory is not zeroed on allocation. Signed-off-by: Liam R. 
Howlett Signed-off-by: Vlastimil Babka --- tools/testing/vma/vma_internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter= nal.h index 3639aa8dd2b06ebe5b9cfcfe6669994fd38c482d..2f1c586400cfd65413d2ff82199= 590b43796b18c 100644 --- a/tools/testing/vma/vma_internal.h +++ b/tools/testing/vma/vma_internal.h @@ -1418,8 +1418,8 @@ static inline void ksm_exit(struct mm_struct *mm) =20 static inline void vma_lock_init(struct vm_area_struct *vma, bool reset_re= fcnt) { - (void)vma; - (void)reset_refcnt; + if (reset_refcnt) + refcount_set(&vma->vm_refcnt, 0); } =20 static inline void vma_numab_state_init(struct vm_area_struct *vma) --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F027302CA3 for ; Wed, 3 Sep 2025 13:00:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904452; cv=none; b=D/ImEiBpvmeqqAAOjn0Aud/ynDfB9c8EhVm2WhHaaU9VdikrRuD9GrDvztqml+HWlt5bZ9hve5zUSncFl3QuPncAk1ShmYbhP00qOE9c4tHIpeIDxxoCfaUwTpT9F1jaK/INhnGUiWc+Mdv+dJ2LhFzG8ol5QZ+PqxHPHtVeMo4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904452; c=relaxed/simple; bh=QHehESsOeA3vi5qXStkkhOtYkTZzN2ia7vHC02Dcz/g=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=g7RcjeTkX/mZm2g0/T8jgpGgPgxfYo9mM9nm2qw3NSWRyFLS6x2gNxS7H2iJI6X5QJi1PsvkUSz9/wQcBOBig3yJx4Z0eyLb9uAkCvRGUW0XELtgycLV61RlXfpUAwSNagkp5DMACWrkT+3aIstszQD8QmiH+9yPpRs8l+9oHYY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) 
header.d=suse.cz header.i=@suse.cz header.b=1rC+6R4/; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=sGfYSGgv; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=1rC+6R4/; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=sGfYSGgv; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="1rC+6R4/"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="sGfYSGgv"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="1rC+6R4/"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="sGfYSGgv" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id A5D691F6E6; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/GJ/0Wx5QlG8JUnl44VCEh2zTPKqkIPoz/HEUfq/O7s=; b=1rC+6R4/UzSv0rkdSlCfIa7Eoeu0lG3zpKHDAhymb8aKOBkwIR/UOHat/o8wp6i/QxrZrD 1YU+AGxQ1RWk3gwZFv6wsiUwGkc08siPxPtmi+kXfV2wome66MVEMUZSPZQVoT57gminoL ODLxXLJwToVND2oh5kD/aTyV1DBaRTw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/GJ/0Wx5QlG8JUnl44VCEh2zTPKqkIPoz/HEUfq/O7s=; b=sGfYSGgv2ij1HpOHi0G+iG8LsnAE/Z4PBBXfzvdxCBXrmMoMkqNJhxJU4OAcRudAX2RDIy zvuoQZWaxQro17Bg== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/GJ/0Wx5QlG8JUnl44VCEh2zTPKqkIPoz/HEUfq/O7s=; b=1rC+6R4/UzSv0rkdSlCfIa7Eoeu0lG3zpKHDAhymb8aKOBkwIR/UOHat/o8wp6i/QxrZrD 1YU+AGxQ1RWk3gwZFv6wsiUwGkc08siPxPtmi+kXfV2wome66MVEMUZSPZQVoT57gminoL ODLxXLJwToVND2oh5kD/aTyV1DBaRTw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/GJ/0Wx5QlG8JUnl44VCEh2zTPKqkIPoz/HEUfq/O7s=; b=sGfYSGgv2ij1HpOHi0G+iG8LsnAE/Z4PBBXfzvdxCBXrmMoMkqNJhxJU4OAcRudAX2RDIy zvuoQZWaxQro17Bg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id EE9BA13B04; Wed, 3 Sep 2025 13:00:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id iDgAOtk7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:09 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:53 +0200 Subject: [PATCH v7 11/21] tools/testing: Add support for changes to slab for sheaves Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-11-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, "Liam R. Howlett" X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,Oracle.com]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:mid,suse.cz:email,oracle.com:email,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO X-Spam-Level: X-Spam-Score: -4.30 From: "Liam R. Howlett" The slab changes for sheaves require more effort in the testing code. Unite all the kmem_cache work into the tools/include slab header for both the vma and maple tree testing. The vma test code also requires importing more #defines to allow for seamless use of the shared kmem_cache code. 
This adds the pthread header to the slab header in the tools directory to allow for the pthread_mutex in linux.c. Signed-off-by: Liam R. Howlett Signed-off-by: Vlastimil Babka --- tools/include/linux/slab.h | 137 ++++++++++++++++++++++++++++++++++= ++-- tools/testing/shared/linux.c | 26 ++------ tools/testing/shared/maple-shim.c | 1 + tools/testing/vma/vma_internal.h | 94 +------------------------- 4 files changed, 142 insertions(+), 116 deletions(-) diff --git a/tools/include/linux/slab.h b/tools/include/linux/slab.h index c87051e2b26f5a7fee0362697fae067076b8e84d..c5c5cc6db5668be2cc94c29065c= cfa7ca7b4bb08 100644 --- a/tools/include/linux/slab.h +++ b/tools/include/linux/slab.h @@ -4,11 +4,31 @@ =20 #include #include +#include =20 -#define SLAB_PANIC 2 #define SLAB_RECLAIM_ACCOUNT 0x00020000UL /* Objects are rec= laimable */ =20 #define kzalloc_node(size, flags, node) kmalloc(size, flags) +enum _slab_flag_bits { + _SLAB_KMALLOC, + _SLAB_HWCACHE_ALIGN, + _SLAB_PANIC, + _SLAB_TYPESAFE_BY_RCU, + _SLAB_ACCOUNT, + _SLAB_FLAGS_LAST_BIT +}; + +#define __SLAB_FLAG_BIT(nr) ((unsigned int __force)(1U << (nr))) +#define __SLAB_FLAG_UNUSED ((unsigned int __force)(0U)) + +#define SLAB_HWCACHE_ALIGN __SLAB_FLAG_BIT(_SLAB_HWCACHE_ALIGN) +#define SLAB_PANIC __SLAB_FLAG_BIT(_SLAB_PANIC) +#define SLAB_TYPESAFE_BY_RCU __SLAB_FLAG_BIT(_SLAB_TYPESAFE_BY_RCU) +#ifdef CONFIG_MEMCG +# define SLAB_ACCOUNT __SLAB_FLAG_BIT(_SLAB_ACCOUNT) +#else +# define SLAB_ACCOUNT __SLAB_FLAG_UNUSED +#endif =20 void *kmalloc(size_t size, gfp_t gfp); void kfree(void *p); @@ -23,6 +43,86 @@ enum slab_state { FULL }; =20 +struct kmem_cache { + pthread_mutex_t lock; + unsigned int size; + unsigned int align; + unsigned int sheaf_capacity; + int nr_objs; + void *objs; + void (*ctor)(void *); + bool non_kernel_enabled; + unsigned int non_kernel; + unsigned long nr_allocated; + unsigned long nr_tallocated; + bool exec_callback; + void (*callback)(void *); + void *private; +}; + +struct kmem_cache_args { + /** + 
* @align: The required alignment for the objects. + * + * %0 means no specific alignment is requested. + */ + unsigned int align; + /** + * @sheaf_capacity: The maximum size of the sheaf. + */ + unsigned int sheaf_capacity; + /** + * @useroffset: Usercopy region offset. + * + * %0 is a valid offset, when @usersize is non-%0 + */ + unsigned int useroffset; + /** + * @usersize: Usercopy region size. + * + * %0 means no usercopy region is specified. + */ + unsigned int usersize; + /** + * @freeptr_offset: Custom offset for the free pointer + * in &SLAB_TYPESAFE_BY_RCU caches + * + * By default &SLAB_TYPESAFE_BY_RCU caches place the free pointer + * outside of the object. This might cause the object to grow in size. + * Cache creators that have a reason to avoid this can specify a custom + * free pointer offset in their struct where the free pointer will be + * placed. + * + * Note that placing the free pointer inside the object requires the + * caller to ensure that no fields are invalidated that are required to + * guard against object recycling (See &SLAB_TYPESAFE_BY_RCU for + * details). + * + * Using %0 as a value for @freeptr_offset is valid. If @freeptr_offset + * is specified, %use_freeptr_offset must be set %true. + * + * Note that @ctor currently isn't supported with custom free pointers + * as a @ctor requires an external free pointer. + */ + unsigned int freeptr_offset; + /** + * @use_freeptr_offset: Whether a @freeptr_offset is used. + */ + bool use_freeptr_offset; + /** + * @ctor: A constructor for the objects. + * + * The constructor is invoked for each object in a newly allocated slab + * page. It is the cache user's responsibility to free object in the + * same state as after calling the constructor, or deal appropriately + * with any differences between a freshly constructed and a reallocated + * object. + * + * %NULL means no constructor. 
+ */ + void (*ctor)(void *); +}; + static inline void *kzalloc(size_t size, gfp_t gfp) { return kmalloc(size, gfp | __GFP_ZERO); @@ -37,9 +137,38 @@ static inline void *kmem_cache_alloc(struct kmem_cache = *cachep, int flags) } void kmem_cache_free(struct kmem_cache *cachep, void *objp); =20 -struct kmem_cache *kmem_cache_create(const char *name, unsigned int size, - unsigned int align, unsigned int flags, - void (*ctor)(void *)); + +struct kmem_cache * +__kmem_cache_create_args(const char *name, unsigned int size, + struct kmem_cache_args *args, unsigned int flags); + +/* If NULL is passed for @args, use this variant with default arguments. */ +static inline struct kmem_cache * +__kmem_cache_default_args(const char *name, unsigned int size, + struct kmem_cache_args *args, unsigned int flags) +{ + struct kmem_cache_args kmem_default_args =3D {}; + + return __kmem_cache_create_args(name, size, &kmem_default_args, flags); +} + +static inline struct kmem_cache * +__kmem_cache_create(const char *name, unsigned int size, unsigned int alig= n, + unsigned int flags, void (*ctor)(void *)) +{ + struct kmem_cache_args kmem_args =3D { + .align =3D align, + .ctor =3D ctor, + }; + + return __kmem_cache_create_args(name, size, &kmem_args, flags); +} + +#define kmem_cache_create(__name, __object_size, __args, ...) 
\ + _Generic((__args), \ + struct kmem_cache_args *: __kmem_cache_create_args, \ + void *: __kmem_cache_default_args, \ + default: __kmem_cache_create)(__name, __object_size, __args, __VA_ARGS__) =20 void kmem_cache_free_bulk(struct kmem_cache *cachep, size_t size, void **l= ist); int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t siz= e, diff --git a/tools/testing/shared/linux.c b/tools/testing/shared/linux.c index 0f97fb0d19e19c327aa4843a35b45cc086f4f366..97b8412ccbb6d222604c7b397c5= 3c65618d8d51b 100644 --- a/tools/testing/shared/linux.c +++ b/tools/testing/shared/linux.c @@ -16,21 +16,6 @@ int nr_allocated; int preempt_count; int test_verbose; =20 -struct kmem_cache { - pthread_mutex_t lock; - unsigned int size; - unsigned int align; - int nr_objs; - void *objs; - void (*ctor)(void *); - unsigned int non_kernel; - unsigned long nr_allocated; - unsigned long nr_tallocated; - bool exec_callback; - void (*callback)(void *); - void *private; -}; - void kmem_cache_set_callback(struct kmem_cache *cachep, void (*callback)(v= oid *)) { cachep->callback =3D callback; @@ -234,23 +219,26 @@ int kmem_cache_alloc_bulk(struct kmem_cache *cachep, = gfp_t gfp, size_t size, } =20 struct kmem_cache * -kmem_cache_create(const char *name, unsigned int size, unsigned int align, - unsigned int flags, void (*ctor)(void *)) +__kmem_cache_create_args(const char *name, unsigned int size, + struct kmem_cache_args *args, + unsigned int flags) { struct kmem_cache *ret =3D malloc(sizeof(*ret)); =20 pthread_mutex_init(&ret->lock, NULL); ret->size =3D size; - ret->align =3D align; + ret->align =3D args->align; + ret->sheaf_capacity =3D args->sheaf_capacity; ret->nr_objs =3D 0; ret->nr_allocated =3D 0; ret->nr_tallocated =3D 0; ret->objs =3D NULL; - ret->ctor =3D ctor; + ret->ctor =3D args->ctor; ret->non_kernel =3D 0; ret->exec_callback =3D false; ret->callback =3D NULL; ret->private =3D NULL; + return ret; } =20 diff --git a/tools/testing/shared/maple-shim.c 
b/tools/testing/shared/maple= -shim.c index 640df76f483e09f3b6f85612786060dd273e2362..9d7b743415660305416e972fa75= b56824211b0eb 100644 --- a/tools/testing/shared/maple-shim.c +++ b/tools/testing/shared/maple-shim.c @@ -3,5 +3,6 @@ /* Very simple shim around the maple tree. */ =20 #include "maple-shared.h" +#include =20 #include "../../../lib/maple_tree.c" diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter= nal.h index 2f1c586400cfd65413d2ff82199590b43796b18c..972ab2686e0a3654cef611ce9f3= 409bc0c38dc80 100644 --- a/tools/testing/vma/vma_internal.h +++ b/tools/testing/vma/vma_internal.h @@ -26,6 +26,7 @@ #include #include #include +#include =20 extern unsigned long stack_guard_gap; #ifdef CONFIG_MMU @@ -509,65 +510,6 @@ struct pagetable_move_control { .len_in =3D len_, \ } =20 -struct kmem_cache_args { - /** - * @align: The required alignment for the objects. - * - * %0 means no specific alignment is requested. - */ - unsigned int align; - /** - * @useroffset: Usercopy region offset. - * - * %0 is a valid offset, when @usersize is non-%0 - */ - unsigned int useroffset; - /** - * @usersize: Usercopy region size. - * - * %0 means no usercopy region is specified. - */ - unsigned int usersize; - /** - * @freeptr_offset: Custom offset for the free pointer - * in &SLAB_TYPESAFE_BY_RCU caches - * - * By default &SLAB_TYPESAFE_BY_RCU caches place the free pointer - * outside of the object. This might cause the object to grow in size. - * Cache creators that have a reason to avoid this can specify a custom - * free pointer offset in their struct where the free pointer will be - * placed. - * - * Note that placing the free pointer inside the object requires the - * caller to ensure that no fields are invalidated that are required to - * guard against object recycling (See &SLAB_TYPESAFE_BY_RCU for - * details). - * - * Using %0 as a value for @freeptr_offset is valid. If @freeptr_offset - * is specified, %use_freeptr_offset must be set %true. 
- * - * Note that @ctor currently isn't supported with custom free pointers - * as a @ctor requires an external free pointer. - */ - unsigned int freeptr_offset; - /** - * @use_freeptr_offset: Whether a @freeptr_offset is used. - */ - bool use_freeptr_offset; - /** - * @ctor: A constructor for the objects. - * - * The constructor is invoked for each object in a newly allocated slab - * page. It is the cache user's responsibility to free object in the - * same state as after calling the constructor, or deal appropriately - * with any differences between a freshly constructed and a reallocated - * object. - * - * %NULL means no constructor. - */ - void (*ctor)(void *); -}; - static inline void vma_iter_invalidate(struct vma_iterator *vmi) { mas_pause(&vmi->mas); @@ -652,40 +594,6 @@ static inline void vma_init(struct vm_area_struct *vma= , struct mm_struct *mm) vma->vm_lock_seq =3D UINT_MAX; } =20 -struct kmem_cache { - const char *name; - size_t object_size; - struct kmem_cache_args *args; -}; - -static inline struct kmem_cache *__kmem_cache_create(const char *name, - size_t object_size, - struct kmem_cache_args *args) -{ - struct kmem_cache *ret =3D malloc(sizeof(struct kmem_cache)); - - ret->name =3D name; - ret->object_size =3D object_size; - ret->args =3D args; - - return ret; -} - -#define kmem_cache_create(__name, __object_size, __args, ...) 
\ - __kmem_cache_create((__name), (__object_size), (__args)) - -static inline void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags) -{ - (void)gfpflags; - - return calloc(s->object_size, 1); -} - -static inline void kmem_cache_free(struct kmem_cache *s, void *x) -{ - free(x); -} - /* * These are defined in vma.h, but sadly vm_stat_account() is referenced by * kernel/fork.c, so we have to these broadly available there, and tempora= rily --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 91306304BD0 for ; Wed, 3 Sep 2025 13:01:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904472; cv=none; b=lr8OZz3wVQGo45c38nngyR4MfVEp2974XeCO5VFfVSDQm+BlpKlU94cCI8gNX+ej+p881om4YfYsRrbrWw8Z7siG6H6ZiQ2x9w6Cptkg3J28CAXNfGJdQ0mes3wODEVZG2bSsBNKzfGRSAEH0z16dIvtJMWNM15saCkEiu9l4OA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904472; c=relaxed/simple; bh=zXygsHQ8M75ldUrzHX4v4b361hHPjjuRMUX5R7ROHQM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tGLvxAdFwrGmVx1Z/7vOkeeadtMZI+Uw/u7klNeaLElCinilxOqTazwvRZapXeNlw1FxsEHZn4XRdZ8BSKAQlloYeiF4huOyvT7lNooX1evboAMQBMeY1SLi3+e2+dvVptf+BJkxU2jMIVlS41fsNx7HK3ZxdL70Rh6l1FC0Nog= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=Aoc32Y8K; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=OcxShp2L; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=Aoc32Y8K; dkim=permerror (0-bit key) header.d=suse.cz 
header.i=@suse.cz header.b=OcxShp2L; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="Aoc32Y8K"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="OcxShp2L"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="Aoc32Y8K"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="OcxShp2L" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id C8C432122A; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GkJLUdth35DaRyhM3TWCLrM77sXSdG1lPON3otYkBww=; b=Aoc32Y8K6z/D1jmhEZ0JziBmWP8l/NTAykVCv7X+avgX9sLROwv0cL8g6wou27QGhylw7v d/dK3sDQAs8zGnhBNuwlWmrptUZnGqvAocyQ/u6G3JLk1jwg/zkB/vyBexwHZDDWmsSbsf aDsIHwrULTDpOEyitOtk/62ADIkIfJQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GkJLUdth35DaRyhM3TWCLrM77sXSdG1lPON3otYkBww=; b=OcxShp2LaA80zjS5jwgpwcQZbrYHyDrvFVkiNAtp5srKA8xy23EfU9tBfRv0hPzRSXL5GZ x0Xk7Eqz/O/FhwDg== Authentication-Results: 
smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GkJLUdth35DaRyhM3TWCLrM77sXSdG1lPON3otYkBww=; b=Aoc32Y8K6z/D1jmhEZ0JziBmWP8l/NTAykVCv7X+avgX9sLROwv0cL8g6wou27QGhylw7v d/dK3sDQAs8zGnhBNuwlWmrptUZnGqvAocyQ/u6G3JLk1jwg/zkB/vyBexwHZDDWmsSbsf aDsIHwrULTDpOEyitOtk/62ADIkIfJQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GkJLUdth35DaRyhM3TWCLrM77sXSdG1lPON3otYkBww=; b=OcxShp2LaA80zjS5jwgpwcQZbrYHyDrvFVkiNAtp5srKA8xy23EfU9tBfRv0hPzRSXL5GZ x0Xk7Eqz/O/FhwDg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 0FE9613B05; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id GLKgA9o7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:54 +0200 Subject: [PATCH v7 12/21] mm, vma: use percpu sheaves for vm_area_struct cache Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-12-71c114cdefef@suse.cz> References: 
<20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[99.99%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:mid,suse.cz:email,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO X-Spam-Level: X-Spam-Score: -4.30 Create the vm_area_struct cache with percpu sheaves of size 32 to improve its performance. 
Reviewed-by: Suren Baghdasaryan Signed-off-by: Vlastimil Babka --- mm/vma_init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/vma_init.c b/mm/vma_init.c index 8e53c7943561e7324e7992946b4065dec1149b82..52c6b55fac4519e0da39ca75ad0= 18e14449d1d95 100644 --- a/mm/vma_init.c +++ b/mm/vma_init.c @@ -16,6 +16,7 @@ void __init vma_state_init(void) struct kmem_cache_args args =3D { .use_freeptr_offset =3D true, .freeptr_offset =3D offsetof(struct vm_area_struct, vm_freeptr), + .sheaf_capacity =3D 32, }; =20 vm_area_cachep =3D kmem_cache_create("vm_area_struct", --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 93D6C3043DC for ; Wed, 3 Sep 2025 13:01:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904470; cv=none; b=dssqhPhVQNAYm6G284Yp+JxVfnH3fKDzjOqoaqdxX3+AzjFIl1ev2ZNpCnOH4pXuI/whPt/XhKlTkk7HFlhI9X8GUeSaCLTJRdEEIpLHhqaSjo+b6X/8BGtlEvMJgQ8G07uLw+wvZr2U2W/5ye/TAJVH2iljrhROcKaNeX/rCc8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904470; c=relaxed/simple; bh=OcjndUE8gaQOpDzBjlFgvhR58KrlDl0vH/GNTgWNlFc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=MuQMCliJedZbyvjcfzTwr+GJbqbmNdJLVLGlTbMS41QQkmYd3D7xuJQRQuY4U7fRYBUHDFtNdp350dkW4apZKRQ2opWGD184MAQ7E3EDsYVSm6jre1+gCkVIc7HQ/d22e5S6sHnn6leaLvdC1QBqq6sfvMBEV5Q+Sm059YCOD3Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=g/crOFhD; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=+sVatJLy; dkim=pass 
(1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=g/crOFhD; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=+sVatJLy; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="g/crOFhD"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="+sVatJLy"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="g/crOFhD"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="+sVatJLy" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id CDB631F747; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVt3cZeiI6R/GA6oApyVXQ+OhlrdalPHGhvrN43p2wQ=; b=g/crOFhD4na0C8rkDfdx8cUVpPT9jg4mfQfUrEEG4AJVIvPjBSHqVuQMjx53jybGO8wQv1 DWgjWKKW0nkBALpU4bUzjCI3zEvlwoWuUoE/GApq97pTKowgtq9dqLM59Kh0Ov/9ySkgTY CFgYJ/oUj4EsO0+gbZKWXeBV7c/nkaU= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVt3cZeiI6R/GA6oApyVXQ+OhlrdalPHGhvrN43p2wQ=; 
b=+sVatJLyO7r62pCDRCB73fRH3sZeAriIa9SwQJzFj/GY9Z4x+DyRR3hpk8Usdnb5CGo5as RkaYUntMqWg2EMAw== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVt3cZeiI6R/GA6oApyVXQ+OhlrdalPHGhvrN43p2wQ=; b=g/crOFhD4na0C8rkDfdx8cUVpPT9jg4mfQfUrEEG4AJVIvPjBSHqVuQMjx53jybGO8wQv1 DWgjWKKW0nkBALpU4bUzjCI3zEvlwoWuUoE/GApq97pTKowgtq9dqLM59Kh0Ov/9ySkgTY CFgYJ/oUj4EsO0+gbZKWXeBV7c/nkaU= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVt3cZeiI6R/GA6oApyVXQ+OhlrdalPHGhvrN43p2wQ=; b=+sVatJLyO7r62pCDRCB73fRH3sZeAriIa9SwQJzFj/GY9Z4x+DyRR3hpk8Usdnb5CGo5as RkaYUntMqWg2EMAw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 29A3313B06; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id GEjhCdo7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:55 +0200 Subject: [PATCH v7 13/21] maple_tree: use percpu sheaves for maple_node_cache Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" 
Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-13-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,suse.cz:email,suse.cz:mid,oracle.com:email] X-Spam-Flag: NO X-Spam-Score: -4.30 Set up the maple_node_cache with percpu sheaves of size 32 to hopefully improve its performance. Note that this will not immediately take advantage of sheaf batching of kfree_rcu() operations, because the maple tree uses call_rcu() with custom callbacks. Follow-up changes to the maple tree will change that and also make use of the prefilled sheaves functionality.
Signed-off-by: Vlastimil Babka Reviewed-by: Sidhartha Kumar Reviewed-by: Suren Baghdasaryan --- lib/maple_tree.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index b4ee2d29d7a962ca374467d0533185f2db3d35ff..a0db6bdc63793b8bbd544e24639= 1d99e880dede3 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -6302,9 +6302,14 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp) =20 void __init maple_tree_init(void) { + struct kmem_cache_args args =3D { + .align =3D sizeof(struct maple_node), + .sheaf_capacity =3D 32, + }; + maple_node_cache =3D kmem_cache_create("maple_node", - sizeof(struct maple_node), sizeof(struct maple_node), - SLAB_PANIC, NULL); + sizeof(struct maple_node), &args, + SLAB_PANIC); } =20 /** --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF4973074B3 for ; Wed, 3 Sep 2025 13:01:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904481; cv=none; b=ETM9kYxiEfiRlxOQQLmFI9h4A71WCeN0mDJ+2TMMOhxbY4tAdZ21QW/ZMwYhmr4sRGy8jVv26qJ7DK1k/xtLwMZZ2l1wTiEioC9VN3JNk3Xbe2oA+0wOgTsJroV0bPinxvoKv2XQiWVYEJGdL8EmY28Q4dLqr91hAOlwUNxceKY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904481; c=relaxed/simple; bh=7O7Hz1Mts4IMOrx+TCnRt2ePdSfzexl/nOsXo91gEXw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=M4Urg97b32lEQ1suZNrXZo+oDKB5m2XAupQl1h9N/VlX9kGsKON46i0D/cS0cVW6rFuVc7rU5tGYEaXGPGnSPjvwpFDGTdUTC/pxnoTbqckHxkMIlZa2lcbwhLjwc0d64E9zJwEvkmWYLezGS9CJDG7/Tk68Cgqao8N5cE4/k0g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) 
header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=UpE5aYpx; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=yG8ZP+1F; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=UpE5aYpx; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=yG8ZP+1F; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="UpE5aYpx"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="yG8ZP+1F"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="UpE5aYpx"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="yG8ZP+1F" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D4CB52122B; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sL+qCDJGk4s8FKjAhsD7S3UX8bYd+2rVoRHmfBRAuDU=; b=UpE5aYpxnGAKCt1S0hrkBGoX0L860Cndg8O68gwLbljC5cpsm/uC7eKulYMEDJBOkwSt2e 53g0APA9BXG6d149lzRi8S4oovaympDBz+GmQgHUqH4qkHbBotvbKmp05sxz6ERVZxUlJA 0kZ/p2X06oWqU7MOOQIsmqhzAQ4oCkY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sL+qCDJGk4s8FKjAhsD7S3UX8bYd+2rVoRHmfBRAuDU=; b=yG8ZP+1Fyl9iMw/IHXxgYX7BycvqdDO7RoSq2Beb8b1hmkpVGljEWuqK/WkC+IDrWxnFM+ 7najWcdnMm7jqWCg== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=UpE5aYpx; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=yG8ZP+1F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sL+qCDJGk4s8FKjAhsD7S3UX8bYd+2rVoRHmfBRAuDU=; b=UpE5aYpxnGAKCt1S0hrkBGoX0L860Cndg8O68gwLbljC5cpsm/uC7eKulYMEDJBOkwSt2e 53g0APA9BXG6d149lzRi8S4oovaympDBz+GmQgHUqH4qkHbBotvbKmp05sxz6ERVZxUlJA 0kZ/p2X06oWqU7MOOQIsmqhzAQ4oCkY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sL+qCDJGk4s8FKjAhsD7S3UX8bYd+2rVoRHmfBRAuDU=; b=yG8ZP+1Fyl9iMw/IHXxgYX7BycvqdDO7RoSq2Beb8b1hmkpVGljEWuqK/WkC+IDrWxnFM+ 7najWcdnMm7jqWCg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 3D9A913B07; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with 
ESMTPSA id 8BjFDto7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:56 +0200 Subject: [PATCH v7 14/21] tools/testing: include maple-shim.c in maple.c Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-14-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_RATELIMITED(0.00)[rspamd.com]; ARC_NA(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+]; 
R_RATELIMIT(0.00)[to_ip_from(RLfsjnp7neds983g95ihcnuzgq)]; DBL_BLOCKED_OPENRESOLVER(0.00)[oracle.com:email,suse.cz:mid,suse.cz:dkim,suse.cz:email,imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns] X-Spam-Flag: NO X-Spam-Level: X-Rspamd-Queue-Id: D4CB52122B X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -4.51 maple.c duplicates some of the code in maple-shim.c, and we are about to add more functionality to maple-shared.h that the userspace maple test will also need, so include it via maple-shim.c. Co-developed-by: Liam R. Howlett Signed-off-by: Liam R. Howlett Signed-off-by: Vlastimil Babka --- tools/testing/radix-tree/maple.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/ma= ple.c index 159d5307b30a4b37e6cf2941848b8718e1b891d9..7fe91f24849b35723ec6aadbe45= ec7d2abedcc11 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -8,14 +8,6 @@ * difficult to handle in kernel tests.
*/ =20 -#define CONFIG_DEBUG_MAPLE_TREE -#define CONFIG_MAPLE_SEARCH -#define MAPLE_32BIT (MAPLE_NODE_SLOTS > 31) -#include "test.h" -#include -#include -#include - #define module_init(x) #define module_exit(x) #define MODULE_AUTHOR(x) @@ -23,7 +15,9 @@ #define MODULE_LICENSE(x) #define dump_stack() assert(0) =20 -#include "../../../lib/maple_tree.c" +#include "test.h" + +#include "../shared/maple-shim.c" #include "../../../lib/test_maple_tree.c" =20 #define RCU_RANGE_COUNT 1000 --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 747B63081AE for ; Wed, 3 Sep 2025 13:01:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904491; cv=none; b=YZ8z5rplTZqdj6kTjqNAQ/QrIqME8H3LTZCAp/CR3QZF1XcAHnpdadxJXomLpm5WlmpjuD0ADL6YhP6eXD9d94iYVMgYmCSJ/4+Ag2iNxDF4liNMIINAeQArsWJtuHue8pq+wMCWTEz1yCRAfSAwMFu8shM8P9YfMHU8B61Ij9c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904491; c=relaxed/simple; bh=8zhwdOkNAvH6snya/EkLPN9kkVJJEEo+XjxQ5KsrJ7Y=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=fNhrUWLfZ/BCc2WcazrhJ6KZgJvdHvvy0IHynOpFyaX2/nOJIPL+4Ds/TOacuI1rlH8/132F9fsWdwSC8LfwCN/114vYbtwZGl4IF6aF268JWS6dtRoIupva52wCG8OU2dkOdhQ6wAW5aykHBccPTUcvc0pS6t9L/7E6k3BHdrI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=s56PP7bb; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=+A8t13hY; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=s56PP7bb; dkim=permerror (0-bit 
key) header.d=suse.cz header.i=@suse.cz header.b=+A8t13hY; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="s56PP7bb"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="+A8t13hY"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="s56PP7bb"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="+A8t13hY" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D69712122E; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JknvsOxfmdGK99z5Y7Du3/KRq6UYdu6X7lKlwYsXCV8=; b=s56PP7bbwHH6whwJL2r5Ho4P507+nSBUP2TENIZjQ6Q81MdfcfZ+M39yPd3egI2mWi4ogm Jaki9Wif5jE8+sZXfWqHbpXIFEkFdLYgcv1lsFftAUPUZfDhTZgWSskrY+yMG6HBRby2+k RgqBBotTah24Rlbnn6UIvj2NRP8aY9o= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JknvsOxfmdGK99z5Y7Du3/KRq6UYdu6X7lKlwYsXCV8=; b=+A8t13hYRyz2t1rAcREe70GXhSQ1k6mOaEitFo2To8F3ZjbHzc2x3A1u867+o4CmmebmsJ JXpRflqZLF43kqCw== 
Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JknvsOxfmdGK99z5Y7Du3/KRq6UYdu6X7lKlwYsXCV8=; b=s56PP7bbwHH6whwJL2r5Ho4P507+nSBUP2TENIZjQ6Q81MdfcfZ+M39yPd3egI2mWi4ogm Jaki9Wif5jE8+sZXfWqHbpXIFEkFdLYgcv1lsFftAUPUZfDhTZgWSskrY+yMG6HBRby2+k RgqBBotTah24Rlbnn6UIvj2NRP8aY9o= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JknvsOxfmdGK99z5Y7Du3/KRq6UYdu6X7lKlwYsXCV8=; b=+A8t13hYRyz2t1rAcREe70GXhSQ1k6mOaEitFo2To8F3ZjbHzc2x3A1u867+o4CmmebmsJ JXpRflqZLF43kqCw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 518B113ACB; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id wEShE9o7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:57 +0200 Subject: [PATCH v7 15/21] testing/radix-tree/maple: Hack around kfree_rcu not existing Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: 
<20250903-slub-percpu-caches-v7-15-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, Pedro Falcato X-Mailer: b4 0.14.2 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,suse.de]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,oracle.com:email,imap1.dmz-prg2.suse.org:helo,suse.cz:email,suse.cz:mid] X-Spam-Flag: NO X-Spam-Score: -4.30 From: "Liam R. Howlett" liburcu doesn't have kfree_rcu (or anything similar). Despite that, we can hack around it in a trivial fashion, by adding a wrapper. The wrapper only works for maple_nodes because we cannot get the kmem_cache pointer any other way in the test code. Link: https://lore.kernel.org/all/20250812162124.59417-1-pfalcato@suse.de/ Suggested-by: Pedro Falcato Signed-off-by: Liam R. 
Howlett Signed-off-by: Vlastimil Babka --- tools/testing/shared/maple-shared.h | 11 +++++++++++ tools/testing/shared/maple-shim.c | 6 ++++++ 2 files changed, 17 insertions(+) diff --git a/tools/testing/shared/maple-shared.h b/tools/testing/shared/map= le-shared.h index dc4d30f3860b9bd23b4177c7d7926ac686887815..2a1e9a8594a2834326cd9374738= b2a2c7c3f9f7c 100644 --- a/tools/testing/shared/maple-shared.h +++ b/tools/testing/shared/maple-shared.h @@ -10,4 +10,15 @@ #include #include "linux/init.h" =20 +void maple_rcu_cb(struct rcu_head *head); +#define rcu_cb maple_rcu_cb + +#define kfree_rcu(_struct, _memb) \ +do { \ + typeof(_struct) _p_struct =3D (_struct); \ + \ + call_rcu(&((_p_struct)->_memb), rcu_cb); \ +} while(0); + + #endif /* __MAPLE_SHARED_H__ */ diff --git a/tools/testing/shared/maple-shim.c b/tools/testing/shared/maple= -shim.c index 9d7b743415660305416e972fa75b56824211b0eb..16252ee616c0489c80490ff25b8= d255427bf9fdc 100644 --- a/tools/testing/shared/maple-shim.c +++ b/tools/testing/shared/maple-shim.c @@ -6,3 +6,9 @@ #include =20 #include "../../../lib/maple_tree.c" + +void maple_rcu_cb(struct rcu_head *head) { + struct maple_node *node =3D container_of(head, struct maple_node, rcu); + + kmem_cache_free(maple_node_cache, node); +} --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6D2530102B for ; Wed, 3 Sep 2025 13:01:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904498; cv=none; b=i5y9uqQ2+qZh3qIEM3LBbQ3551R3BLwTxnBeOJX79pG4twCxMn7UIf8EjxoXB2kTo79jVTcnilPIgsCClqctnE5gzwJ83uGzRNhPjjuuRWSqBXYDMexVIgnVcZH7fAyJ9W9Q8ARU89K1fMy8v5gGTCAIBzcEHB1wiBSYssfwlqs= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1756904498; c=relaxed/simple; bh=NEnPKJoh4PSRWp8CZDKpOmH3tB/16GPOtmyVTP2ZnoU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tNujRAmYNVTpsqMASF/n1FYfUdcQ5q3I4kGrmFRV1PwP+yewxqiD8iEjuxHTF6zM/cQ8hLhc/4nJlRLtT+/nYqJsM07VM5b41BeB71XgXg9rYI4zCGOlioT5ePIggP0Urhq86hJKprUgtT3Ok3KH0d+7xlzXLwauCcqki2sSx5o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=JpX1EFHl; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=boqME7eI; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=JpX1EFHl; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=boqME7eI; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="JpX1EFHl"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="boqME7eI"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="JpX1EFHl"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="boqME7eI" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D808D1F749; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4DoczhIJm/QIxVWZ7D37eQBiOD25npSD9wVr63TFoHc=; b=JpX1EFHlgY9lv/Jg9Np6YcccAcWwOH5vSSehOAl2iCisgztVji8k/m4iZWWDfF9G4MrMdT 1no34TX1PZX+TkQBLMAwYVtWVMaHPFyVU8Lzoa22w0y5YHWzIoAyZhPf0NVON8mUbSGOtp KivRwn6uQo+9Em2sG9jItp9gTXDpBSA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4DoczhIJm/QIxVWZ7D37eQBiOD25npSD9wVr63TFoHc=; b=boqME7eIeY8AGMTBsm7dwQZietZj8zprg+GhYZE9cE6XJPyNxjngG+iIitHglC+a4CF/48 pkAYS3zSHpyz4yDA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4DoczhIJm/QIxVWZ7D37eQBiOD25npSD9wVr63TFoHc=; b=JpX1EFHlgY9lv/Jg9Np6YcccAcWwOH5vSSehOAl2iCisgztVji8k/m4iZWWDfF9G4MrMdT 1no34TX1PZX+TkQBLMAwYVtWVMaHPFyVU8Lzoa22w0y5YHWzIoAyZhPf0NVON8mUbSGOtp KivRwn6uQo+9Em2sG9jItp9gTXDpBSA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4DoczhIJm/QIxVWZ7D37eQBiOD25npSD9wVr63TFoHc=; b=boqME7eIeY8AGMTBsm7dwQZietZj8zprg+GhYZE9cE6XJPyNxjngG+iIitHglC+a4CF/48 pkAYS3zSHpyz4yDA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 
server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 675B913B08; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id sDzmGNo7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:58 +0200 Subject: [PATCH v7 16/21] maple_tree: Use kfree_rcu in ma_free_rcu Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-16-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, Pedro Falcato X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,suse.de]; R_RATELIMIT(0.00)[to(RL941jgdop1fyjkq8h4),to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; 
RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:mid,suse.cz:email,suse.de:email,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO X-Spam-Level: X-Spam-Score: -4.30 From: Pedro Falcato kfree_rcu is an optimized version of call_rcu + kfree. It used to not be possible to call it on non-kmalloc objects, but this restriction was lifted ever since SLOB was dropped from the kernel, and since commit 6c6c47b063b5 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy(= )"). Thus, replace call_rcu + mt_free_rcu with kfree_rcu. Signed-off-by: Pedro Falcato Signed-off-by: Vlastimil Babka --- lib/maple_tree.c | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index a0db6bdc63793b8bbd544e246391d99e880dede3..d77e82362f03905040ac61630f9= 2fe9af1e59f98 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -191,13 +191,6 @@ static inline void mt_free_bulk(size_t size, void __rc= u **nodes) kmem_cache_free_bulk(maple_node_cache, size, (void **)nodes); } =20 -static void mt_free_rcu(struct rcu_head *head) -{ - struct maple_node *node =3D container_of(head, struct maple_node, rcu); - - kmem_cache_free(maple_node_cache, node); -} - /* * ma_free_rcu() - Use rcu callback to free a maple node * @node: The node to free @@ -208,7 +201,7 @@ static void mt_free_rcu(struct rcu_head *head) static void ma_free_rcu(struct maple_node *node) { WARN_ON(node->parent !=3D ma_parent_ptr(node)); - call_rcu(&node->rcu, mt_free_rcu); + kfree_rcu(node, rcu); } =20 static void mt_set_height(struct maple_tree *mt, unsigned char height) @@ -5281,7 +5274,7 @@ static void mt_free_walk(struct rcu_head *head) mt_free_bulk(node->slot_len, slots); =20 free_leaf: - mt_free_rcu(&node->rcu); + mt_free_one(node); } =20 static inline void __rcu **mte_destroy_descend(struct maple_enode **enode, @@ -5365,7 +5358,7 @@ static void mt_destroy_walk(struct maple_enode *enode= , struct maple_tree *mt, =20 free_leaf: if (free) - 
mt_free_rcu(&node->rcu); + mt_free_one(node); else mt_clear_meta(mt, node, node->type); } --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0EF37309DA4 for ; Wed, 3 Sep 2025 13:01:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904507; cv=none; b=ZGf19Oe/mctdHQV+ZfKaX1SbExQgWt+Gh/iC0Te1Xfdmwz0Hg6z7iScqiQ511LY0COf7cmNxIQtQ223i+30jY/REbswpS3tHdweDHBnW1SKLpAouXrYaqWUjsoKasaAMY8xNr5b2LKmk1vfDYa6KiqBnk2hzvr11uEvUdWa3r80= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904507; c=relaxed/simple; bh=cbYZ4uYu2Ew4je+ZVdv6l5r3Tds6HvQmitLofhje3ds=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XvN9pfWQKgboWFZrxPPN6etVvun3eZ2sSa/IYo7/8SyB2EdMXr27R20ARjzQsqY7WZhQrI9QPg9dZTdT+Gv0EYG7CLyeH5WrX57Iii7F/H0Mvg5v/SiMR1/Ddo9t9Brd6DVzJs93OtVOLiaRWk3Hk/H5ho19+5Nqud/ucPvSxOs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=LOjoGn2N; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=tlcfQmu9; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=LOjoGn2N; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=tlcfQmu9; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz 
header.b="LOjoGn2N"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="tlcfQmu9"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="LOjoGn2N"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="tlcfQmu9" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D8BE81F74C; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Orc9L8mxDfBM2P4Cd87ufTnmwdvv431SxJfeIHzvdf8=; b=LOjoGn2NpQu0Skj44il6BTjGJzIstckK0pVvMsVotT3/OhaZGyHYwxQD8uT70gohtQNyAo XbSbSqC2twvWHgV9hJ7n8Ibm3dpue2uZrvecX09kBgtU//+j0BWZEBf1HLGz9m0mnzx6Nl 1OoK9JUReCORUdrabgNQtp8gqouT0P8= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Orc9L8mxDfBM2P4Cd87ufTnmwdvv431SxJfeIHzvdf8=; b=tlcfQmu9oBUZKp9gtiDFF3bLmdLERdTGZkoGbluYQxKKJck+WvjQZvVK/VOdD92MAqsmTv Bhj3OwnoUG4h0LAA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=Orc9L8mxDfBM2P4Cd87ufTnmwdvv431SxJfeIHzvdf8=; b=LOjoGn2NpQu0Skj44il6BTjGJzIstckK0pVvMsVotT3/OhaZGyHYwxQD8uT70gohtQNyAo XbSbSqC2twvWHgV9hJ7n8Ibm3dpue2uZrvecX09kBgtU//+j0BWZEBf1HLGz9m0mnzx6Nl 1OoK9JUReCORUdrabgNQtp8gqouT0P8= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Orc9L8mxDfBM2P4Cd87ufTnmwdvv431SxJfeIHzvdf8=; b=tlcfQmu9oBUZKp9gtiDFF3bLmdLERdTGZkoGbluYQxKKJck+WvjQZvVK/VOdD92MAqsmTv Bhj3OwnoUG4h0LAA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 7C9D313A94; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 0N8iHto7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 14:59:59 +0200 Subject: [PATCH v7 17/21] maple_tree: Replace mt_free_one() with kfree() Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-17-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, Pedro Falcato X-Mailer: b4 0.14.2 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[99.99%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,suse.de]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc),to(RL941jgdop1fyjkq8h4)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,suse.cz:email,suse.cz:mid,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO X-Spam-Score: -4.30 From: Pedro Falcato kfree() is a little shorter and works with kmem_cache_alloc'd pointers too. Also lets us remove one more helper. 
Signed-off-by: Pedro Falcato Signed-off-by: Vlastimil Babka --- lib/maple_tree.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index d77e82362f03905040ac61630f92fe9af1e59f98..b361b484cfcaacd99472dd4c2b8= de9260b307425 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -181,11 +181,6 @@ static inline int mt_alloc_bulk(gfp_t gfp, size_t size= , void **nodes) return kmem_cache_alloc_bulk(maple_node_cache, gfp, size, nodes); } =20 -static inline void mt_free_one(struct maple_node *node) -{ - kmem_cache_free(maple_node_cache, node); -} - static inline void mt_free_bulk(size_t size, void __rcu **nodes) { kmem_cache_free_bulk(maple_node_cache, size, (void **)nodes); @@ -5274,7 +5269,7 @@ static void mt_free_walk(struct rcu_head *head) mt_free_bulk(node->slot_len, slots); =20 free_leaf: - mt_free_one(node); + kfree(node); } =20 static inline void __rcu **mte_destroy_descend(struct maple_enode **enode, @@ -5358,7 +5353,7 @@ static void mt_destroy_walk(struct maple_enode *enode= , struct maple_tree *mt, =20 free_leaf: if (free) - mt_free_one(node); + kfree(node); else mt_clear_meta(mt, node, node->type); } @@ -5585,7 +5580,7 @@ void mas_destroy(struct ma_state *mas) mt_free_bulk(count, (void __rcu **)&node->slot[1]); total -=3D count; } - mt_free_one(ma_mnode_ptr(node)); + kfree(ma_mnode_ptr(node)); total--; } =20 @@ -6635,7 +6630,7 @@ static void mas_dup_free(struct ma_state *mas) } =20 node =3D mte_to_node(mas->node); - mt_free_one(node); + kfree(node); } =20 /* --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3FC4301038 for ; Wed, 3 Sep 2025 13:01:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904489; cv=none; b=m0sLPebnsQFmq1ftIVRvk+sV0vNysXGt/2Z4+MeMZU55AEcaT8PWkMoKcyRkAvzPqwSiCcSBySvIwWYQRTC1KLuymlB43Ox4CzLwF/BxZO+d95E0kwlZmBke46i/OfsaH1HXQEGP4hdll7q+xMGPKmZEgwEDoy6YV9pVTOzanRg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904489; c=relaxed/simple; bh=hL/uYA3U34m1+gJpgX8WkZHdiwJVs5hstD7odj5T4U0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=KnXTYJJORqTSX5HIEAIEMVe/OIsFFy0vAqcLka+z88tB6Y7W/ZsCRc9uaS3amsjI5eQ+QqUIjdcR57yQjsAzm2SJmDFuhifq1PvH3Um5uJxfI7XO9kBLG7cLa2yeSLDVGGWqNJlbYtvP+YXITahPGsIlszliGFuP+1HSaNzqwP8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=lcgO7m+f; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Vhzc6qP5; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=lcgO7m+f; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=Vhzc6qP5; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="lcgO7m+f"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Vhzc6qP5"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="lcgO7m+f"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="Vhzc6qP5" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de 
(Postfix) with ESMTPS id D8B421F74B; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Rqh/ojynYkYiIA62N7CbXckVi4xdirpy4Yia9XnPMB4=; b=lcgO7m+fb5soZLD6tTGdgqUf95yjYRNc7HcSIk2zwh9a7DMfyNwXiS/k0hc0jf4N/WsNer fk5yu6yq4BzKVp2UP6c2gbT6OWelwqx6iYGSXHL3RU1Y3cJbf8zormFeJYcw2qz9wgSyI6 VyiK4b1IORWktlKlbe6Hl5lHsm0wtF4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Rqh/ojynYkYiIA62N7CbXckVi4xdirpy4Yia9XnPMB4=; b=Vhzc6qP5e9IkNDjeKP79Np6thntUaimCqyf9G1z9LZHScmEe3LuL74PwzIM1tmCnVqqnn6 iHdh3OliNB4T65Bw== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Rqh/ojynYkYiIA62N7CbXckVi4xdirpy4Yia9XnPMB4=; b=lcgO7m+fb5soZLD6tTGdgqUf95yjYRNc7HcSIk2zwh9a7DMfyNwXiS/k0hc0jf4N/WsNer fk5yu6yq4BzKVp2UP6c2gbT6OWelwqx6iYGSXHL3RU1Y3cJbf8zormFeJYcw2qz9wgSyI6 VyiK4b1IORWktlKlbe6Hl5lHsm0wtF4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=Rqh/ojynYkYiIA62N7CbXckVi4xdirpy4Yia9XnPMB4=; b=Vhzc6qP5e9IkNDjeKP79Np6thntUaimCqyf9G1z9LZHScmEe3LuL74PwzIM1tmCnVqqnn6 iHdh3OliNB4T65Bw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 927DA13B09; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id GI14I9o7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 15:00:00 +0200 Subject: [PATCH v7 18/21] tools/testing: Add support for prefilled slab sheafs Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-18-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. 
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[oracle.com:email,suse.cz:email,suse.cz:mid,imap1.dmz-prg2.suse.org:helo] X-Spam-Flag: NO X-Spam-Score: -4.30 From: "Liam R. Howlett" Add the prefilled sheaf structs to the slab header and the associated functions to the testing/shared/linux.c file. Signed-off-by: Liam R. 
Howlett Signed-off-by: Vlastimil Babka --- tools/include/linux/slab.h | 28 ++++++++++++++ tools/testing/shared/linux.c | 89 ++++++++++++++++++++++++++++++++++++++++= ++++ 2 files changed, 117 insertions(+) diff --git a/tools/include/linux/slab.h b/tools/include/linux/slab.h index c5c5cc6db5668be2cc94c29065ccfa7ca7b4bb08..94937a699402bd1f31887dfb52b= 6fd0a3c986f43 100644 --- a/tools/include/linux/slab.h +++ b/tools/include/linux/slab.h @@ -123,6 +123,18 @@ struct kmem_cache_args { void (*ctor)(void *); }; =20 +struct slab_sheaf { + union { + struct list_head barn_list; + /* only used for prefilled sheafs */ + unsigned int capacity; + }; + struct kmem_cache *cache; + unsigned int size; + int node; /* only used for rcu_sheaf */ + void *objects[]; +}; + static inline void *kzalloc(size_t size, gfp_t gfp) { return kmalloc(size, gfp | __GFP_ZERO); @@ -173,5 +185,21 @@ __kmem_cache_create(const char *name, unsigned int siz= e, unsigned int align, void kmem_cache_free_bulk(struct kmem_cache *cachep, size_t size, void **l= ist); int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t siz= e, void **list); +struct slab_sheaf * +kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int siz= e); + +void * +kmem_cache_alloc_from_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf); + +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf); +int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf **sheafp, unsigned int size); + +static inline unsigned int kmem_cache_sheaf_size(struct slab_sheaf *sheaf) +{ + return sheaf->size; +} =20 #endif /* _TOOLS_SLAB_H */ diff --git a/tools/testing/shared/linux.c b/tools/testing/shared/linux.c index 97b8412ccbb6d222604c7b397c53c65618d8d51b..4ceff7969b78cf8e33cd1e021c6= 8bc9f8a02a7a1 100644 --- a/tools/testing/shared/linux.c +++ b/tools/testing/shared/linux.c @@ -137,6 +137,12 @@ void kmem_cache_free_bulk(struct kmem_cache *cachep, s= ize_t 
size, void **list) if (kmalloc_verbose) pr_debug("Bulk free %p[0-%zu]\n", list, size - 1); =20 + if (cachep->exec_callback) { + if (cachep->callback) + cachep->callback(cachep->private); + cachep->exec_callback =3D false; + } + pthread_mutex_lock(&cachep->lock); for (int i =3D 0; i < size; i++) kmem_cache_free_locked(cachep, list[i]); @@ -242,6 +248,89 @@ __kmem_cache_create_args(const char *name, unsigned in= t size, return ret; } =20 +struct slab_sheaf * +kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int siz= e) +{ + struct slab_sheaf *sheaf; + unsigned int capacity; + + if (s->exec_callback) { + if (s->callback) + s->callback(s->private); + s->exec_callback =3D false; + } + + capacity =3D max(size, s->sheaf_capacity); + + sheaf =3D calloc(1, sizeof(*sheaf) + sizeof(void *) * capacity); + if (!sheaf) + return NULL; + + sheaf->cache =3D s; + sheaf->capacity =3D capacity; + sheaf->size =3D kmem_cache_alloc_bulk(s, gfp, size, sheaf->objects); + if (!sheaf->size) { + free(sheaf); + return NULL; + } + + return sheaf; +} + +int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf **sheafp, unsigned int size) +{ + struct slab_sheaf *sheaf =3D *sheafp; + int refill; + + if (sheaf->size >=3D size) + return 0; + + if (size > sheaf->capacity) { + sheaf =3D kmem_cache_prefill_sheaf(s, gfp, size); + if (!sheaf) + return -ENOMEM; + + kmem_cache_return_sheaf(s, gfp, *sheafp); + *sheafp =3D sheaf; + return 0; + } + + refill =3D kmem_cache_alloc_bulk(s, gfp, size - sheaf->size, + &sheaf->objects[sheaf->size]); + if (!refill) + return -ENOMEM; + + sheaf->size +=3D refill; + return 0; +} + +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf) +{ + if (sheaf->size) + kmem_cache_free_bulk(s, sheaf->size, &sheaf->objects[0]); + + free(sheaf); +} + +void * +kmem_cache_alloc_from_sheaf(struct kmem_cache *s, gfp_t gfp, + struct slab_sheaf *sheaf) +{ + void *obj; + + if (sheaf->size =3D=3D 0) { + 
printf("Nothing left in sheaf!\n"); + return NULL; + } + + obj =3D sheaf->objects[--sheaf->size]; + sheaf->objects[sheaf->size] =3D NULL; + + return obj; +} + /* * Test the test infrastructure for kem_cache_alloc/free and bulk counterp= arts. */ --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB8793019C7 for ; Wed, 3 Sep 2025 13:01:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904517; cv=none; b=jYfJUVjBqMmDcwRcf1O2vMSiiTdHa3Xo+YFxd5Tgq4FBkyp8wUYskd+ZuOAUjjyKuh6f9aXu8s4odJxRXhkbfCn9eSdmhc3o1ytSSeBaww02UaNb/fz5S04whuGqat5lMqCuVzEUNuZeJN2X6I1OK8i14gHote5qraqoco8Ko7E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904517; c=relaxed/simple; bh=4MbU/ndqsU9rIaibB/aDJ9C+A4Ud56l5wiUiJjGNlpY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=O9UND+QGrI2D0yAGcXsfGL0dZjzDNbtbCziZUHMLTlvFAZQg2HJvO1dafRuBXjmiheqShkQELMDmY8sqbrYknu8Rbqosd80dEcZsuhiDYKkPixkh8fMiV8qlJk/3t464iPCxKn4OQ2eYiT2hGVv8lSNjvw4xBl4NLAFmV1ALfPg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=WPm7uRoM; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=l4Cjxfqy; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=WPm7uRoM; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=l4Cjxfqy; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="WPm7uRoM"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="l4Cjxfqy"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="WPm7uRoM"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="l4Cjxfqy" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D9F051F750; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Hm52WqvVVqHMnek513a3WUxA9fUfWZhAxDTco+ElhzE=; b=WPm7uRoM9fRFAN0kplZtUP7zV7s4oBEZG5FOgEEVg6DUZRbzF0c0KVR69+5jXvj++U3iXz hB1qLjbLl3K6miGUVpJhpQGAKY1kZsWMd3+ak5q4eG0Aos8wMRXa1LVTVVbc/Y3fPCwUd8 ljHki3zWJZnnpoLlFVcXVjtuu+L/3fw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Hm52WqvVVqHMnek513a3WUxA9fUfWZhAxDTco+ElhzE=; b=l4Cjxfqy6iv4wAPe9aGwYoKpQgaC9EgGjk6CMt6ONt/bMWq23BAOIKx+bA+lCBaxm4o8Mb N3HZkz+tTyNHlRAw== Authentication-Results: smtp-out2.suse.de; none
Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id A6FF213A3B; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 0LR/KNo7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 15:00:01 +0200 Subject: [PATCH v7 19/21] maple_tree: Prefilled sheaf conversion and testing Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-19-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R.
Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz X-Mailer: b4 0.14.2 X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[13]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz]; R_RATELIMIT(0.00)[to(RL941jgdop1fyjkq8h4),to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,suse.cz:mid,suse.cz:email,oracle.com:email] X-Spam-Flag: NO X-Spam-Level: X-Spam-Score: -4.30 From: "Liam R. Howlett" Use prefilled sheaves instead of bulk allocations. This should speed up the allocations and the return path of unused allocations. Remove the push and pop of nodes from the maple state as this is now handled by the slab layer with sheaves. Tests that exercised the removed node push/pop and allocation-request interfaces have been removed or disabled, since the tree no longer provides those features. Signed-off-by: Liam R.
Howlett Signed-off-by: Vlastimil Babka --- include/linux/maple_tree.h | 6 +- lib/maple_tree.c | 329 ++++++--------------------- lib/test_maple_tree.c | 8 + tools/testing/radix-tree/maple.c | 464 ++---------------------------------= ---- tools/testing/shared/linux.c | 5 +- 5 files changed, 99 insertions(+), 713 deletions(-) diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index bafe143b1f783202e27b32567fffee4149e8e266..166fd67e00d882b1e6de1f80c1b= 590bba7497cd3 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -442,7 +442,8 @@ struct ma_state { struct maple_enode *node; /* The node containing this entry */ unsigned long min; /* The minimum index of this node - implied pivot min= */ unsigned long max; /* The maximum index of this node - implied pivot max= */ - struct maple_alloc *alloc; /* Allocated nodes for this operation */ + struct slab_sheaf *sheaf; /* Allocated nodes for this operation */ + unsigned long node_request; enum maple_status status; /* The status of the state (active, start, none= , etc) */ unsigned char depth; /* depth of tree descent during write */ unsigned char offset; @@ -490,7 +491,8 @@ struct ma_wr_state { .status =3D ma_start, \ .min =3D 0, \ .max =3D ULONG_MAX, \ - .alloc =3D NULL, \ + .node_request=3D 0, \ + .sheaf =3D NULL, \ .mas_flags =3D 0, \ .store_type =3D wr_invalid, \ } diff --git a/lib/maple_tree.c b/lib/maple_tree.c index b361b484cfcaacd99472dd4c2b8de9260b307425..3f14bfa7fe1c20aac3e127f0def= d268c3dbca6aa 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -186,6 +186,22 @@ static inline void mt_free_bulk(size_t size, void __rc= u **nodes) kmem_cache_free_bulk(maple_node_cache, size, (void **)nodes); } =20 +static void mt_return_sheaf(struct slab_sheaf *sheaf) +{ + kmem_cache_return_sheaf(maple_node_cache, GFP_KERNEL, sheaf); +} + +static struct slab_sheaf *mt_get_sheaf(gfp_t gfp, int count) +{ + return kmem_cache_prefill_sheaf(maple_node_cache, gfp, count); +} + +static int 
mt_refill_sheaf(gfp_t gfp, struct slab_sheaf **sheaf, + unsigned int size) +{ + return kmem_cache_refill_sheaf(maple_node_cache, gfp, sheaf, size); +} + /* * ma_free_rcu() - Use rcu callback to free a maple node * @node: The node to free @@ -578,67 +594,6 @@ static __always_inline bool mte_dead_node(const struct= maple_enode *enode) return ma_dead_node(node); } =20 -/* - * mas_allocated() - Get the number of nodes allocated in a maple state. - * @mas: The maple state - * - * The ma_state alloc member is overloaded to hold a pointer to the first - * allocated node or to the number of requested nodes to allocate. If bit= 0 is - * set, then the alloc contains the number of requested nodes. If there i= s an - * allocated node, then the total allocated nodes is in that node. - * - * Return: The total number of nodes allocated - */ -static inline unsigned long mas_allocated(const struct ma_state *mas) -{ - if (!mas->alloc || ((unsigned long)mas->alloc & 0x1)) - return 0; - - return mas->alloc->total; -} - -/* - * mas_set_alloc_req() - Set the requested number of allocations. - * @mas: the maple state - * @count: the number of allocations. - * - * The requested number of allocations is either in the first allocated no= de, - * located in @mas->alloc->request_count, or directly in @mas->alloc if th= ere is - * no allocated node. Set the request either in the node or do the necess= ary - * encoding to store in @mas->alloc directly. - */ -static inline void mas_set_alloc_req(struct ma_state *mas, unsigned long c= ount) -{ - if (!mas->alloc || ((unsigned long)mas->alloc & 0x1)) { - if (!count) - mas->alloc =3D NULL; - else - mas->alloc =3D (struct maple_alloc *)(((count) << 1U) | 1U); - return; - } - - mas->alloc->request_count =3D count; -} - -/* - * mas_alloc_req() - get the requested number of allocations. - * @mas: The maple state - * - * The alloc count is either stored directly in @mas, or in - * @mas->alloc->request_count if there is at least one node allocated. 
De= code - * the request count if it's stored directly in @mas->alloc. - * - * Return: The allocation request count. - */ -static inline unsigned int mas_alloc_req(const struct ma_state *mas) -{ - if ((unsigned long)mas->alloc & 0x1) - return (unsigned long)(mas->alloc) >> 1; - else if (mas->alloc) - return mas->alloc->request_count; - return 0; -} - /* * ma_pivots() - Get a pointer to the maple node pivots. * @node: the maple node @@ -1142,77 +1097,15 @@ static int mas_ascend(struct ma_state *mas) */ static inline struct maple_node *mas_pop_node(struct ma_state *mas) { - struct maple_alloc *ret, *node =3D mas->alloc; - unsigned long total =3D mas_allocated(mas); - unsigned int req =3D mas_alloc_req(mas); + struct maple_node *ret; =20 - /* nothing or a request pending. */ - if (WARN_ON(!total)) + if (WARN_ON_ONCE(!mas->sheaf)) return NULL; =20 - if (total =3D=3D 1) { - /* single allocation in this ma_state */ - mas->alloc =3D NULL; - ret =3D node; - goto single_node; - } - - if (node->node_count =3D=3D 1) { - /* Single allocation in this node. */ - mas->alloc =3D node->slot[0]; - mas->alloc->total =3D node->total - 1; - ret =3D node; - goto new_head; - } - node->total--; - ret =3D node->slot[--node->node_count]; - node->slot[node->node_count] =3D NULL; - -single_node: -new_head: - if (req) { - req++; - mas_set_alloc_req(mas, req); - } - + ret =3D kmem_cache_alloc_from_sheaf(maple_node_cache, GFP_NOWAIT, mas->sh= eaf); memset(ret, 0, sizeof(*ret)); - return (struct maple_node *)ret; -} - -/* - * mas_push_node() - Push a node back on the maple state allocation. - * @mas: The maple state - * @used: The used maple node - * - * Stores the maple node back into @mas->alloc for reuse. Updates allocat= ed and - * requested node count as necessary. 
- */ -static inline void mas_push_node(struct ma_state *mas, struct maple_node *= used) -{ - struct maple_alloc *reuse =3D (struct maple_alloc *)used; - struct maple_alloc *head =3D mas->alloc; - unsigned long count; - unsigned int requested =3D mas_alloc_req(mas); =20 - count =3D mas_allocated(mas); - - reuse->request_count =3D 0; - reuse->node_count =3D 0; - if (count) { - if (head->node_count < MAPLE_ALLOC_SLOTS) { - head->slot[head->node_count++] =3D reuse; - head->total++; - goto done; - } - reuse->slot[0] =3D head; - reuse->node_count =3D 1; - } - - reuse->total =3D count + 1; - mas->alloc =3D reuse; -done: - if (requested > 1) - mas_set_alloc_req(mas, requested - 1); + return ret; } =20 /* @@ -1222,75 +1115,32 @@ static inline void mas_push_node(struct ma_state *m= as, struct maple_node *used) */ static inline void mas_alloc_nodes(struct ma_state *mas, gfp_t gfp) { - struct maple_alloc *node; - unsigned long allocated =3D mas_allocated(mas); - unsigned int requested =3D mas_alloc_req(mas); - unsigned int count; - void **slots =3D NULL; - unsigned int max_req =3D 0; - - if (!requested) - return; + if (unlikely(mas->sheaf)) { + unsigned long refill =3D mas->node_request; =20 - mas_set_alloc_req(mas, 0); - if (mas->mas_flags & MA_STATE_PREALLOC) { - if (allocated) + if(kmem_cache_sheaf_size(mas->sheaf) >=3D refill) { + mas->node_request =3D 0; return; - WARN_ON(!allocated); - } - - if (!allocated || mas->alloc->node_count =3D=3D MAPLE_ALLOC_SLOTS) { - node =3D (struct maple_alloc *)mt_alloc_one(gfp); - if (!node) - goto nomem_one; - - if (allocated) { - node->slot[0] =3D mas->alloc; - node->node_count =3D 1; - } else { - node->node_count =3D 0; } =20 - mas->alloc =3D node; - node->total =3D ++allocated; - node->request_count =3D 0; - requested--; - } + if (mt_refill_sheaf(gfp, &mas->sheaf, refill)) + goto error; =20 - node =3D mas->alloc; - while (requested) { - max_req =3D MAPLE_ALLOC_SLOTS - node->node_count; - slots =3D (void 
**)&node->slot[node->node_count]; - max_req =3D min(requested, max_req); - count =3D mt_alloc_bulk(gfp, max_req, slots); - if (!count) - goto nomem_bulk; - - if (node->node_count =3D=3D 0) { - node->slot[0]->node_count =3D 0; - node->slot[0]->request_count =3D 0; - } + mas->node_request =3D 0; + return; + } =20 - node->node_count +=3D count; - allocated +=3D count; - /* find a non-full node*/ - do { - node =3D node->slot[0]; - } while (unlikely(node->node_count =3D=3D MAPLE_ALLOC_SLOTS)); - requested -=3D count; + mas->sheaf =3D mt_get_sheaf(gfp, mas->node_request); + if (likely(mas->sheaf)) { + mas->node_request =3D 0; + return; } - mas->alloc->total =3D allocated; - return; =20 -nomem_bulk: - /* Clean up potential freed allocations on bulk failure */ - memset(slots, 0, max_req * sizeof(unsigned long)); - mas->alloc->total =3D allocated; -nomem_one: - mas_set_alloc_req(mas, requested); +error: mas_set_err(mas, -ENOMEM); } =20 + /* * mas_free() - Free an encoded maple node * @mas: The maple state @@ -1301,42 +1151,7 @@ static inline void mas_alloc_nodes(struct ma_state *= mas, gfp_t gfp) */ static inline void mas_free(struct ma_state *mas, struct maple_enode *used) { - struct maple_node *tmp =3D mte_to_node(used); - - if (mt_in_rcu(mas->tree)) - ma_free_rcu(tmp); - else - mas_push_node(mas, tmp); -} - -/* - * mas_node_count_gfp() - Check if enough nodes are allocated and request = more - * if there is not enough nodes. - * @mas: The maple state - * @count: The number of nodes needed - * @gfp: the gfp flags - */ -static void mas_node_count_gfp(struct ma_state *mas, int count, gfp_t gfp) -{ - unsigned long allocated =3D mas_allocated(mas); - - if (allocated < count) { - mas_set_alloc_req(mas, count - allocated); - mas_alloc_nodes(mas, gfp); - } -} - -/* - * mas_node_count() - Check if enough nodes are allocated and request more= if - * there is not enough nodes. 
- * @mas: The maple state - * @count: The number of nodes needed - * - * Note: Uses GFP_NOWAIT | __GFP_NOWARN for gfp flags. - */ -static void mas_node_count(struct ma_state *mas, int count) -{ - return mas_node_count_gfp(mas, count, GFP_NOWAIT | __GFP_NOWARN); + ma_free_rcu(mte_to_node(used)); } =20 /* @@ -2511,10 +2326,7 @@ static inline void mas_topiary_node(struct ma_state = *mas, enode =3D tmp_mas->node; tmp =3D mte_to_node(enode); mte_set_node_dead(enode); - if (in_rcu) - ma_free_rcu(tmp); - else - mas_push_node(mas, tmp); + ma_free_rcu(tmp); } =20 /* @@ -4162,7 +3974,7 @@ static inline void mas_wr_prealloc_setup(struct ma_wr= _state *wr_mas) * * Return: Number of nodes required for preallocation. */ -static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entr= y) +static inline void mas_prealloc_calc(struct ma_wr_state *wr_mas, void *ent= ry) { struct ma_state *mas =3D wr_mas->mas; unsigned char height =3D mas_mt_height(mas); @@ -4208,7 +4020,7 @@ static inline int mas_prealloc_calc(struct ma_wr_stat= e *wr_mas, void *entry) WARN_ON_ONCE(1); } =20 - return ret; + mas->node_request =3D ret; } =20 /* @@ -4269,15 +4081,15 @@ static inline enum store_type mas_wr_store_type(str= uct ma_wr_state *wr_mas) */ static inline void mas_wr_preallocate(struct ma_wr_state *wr_mas, void *en= try) { - int request; + struct ma_state *mas =3D wr_mas->mas; =20 mas_wr_prealloc_setup(wr_mas); - wr_mas->mas->store_type =3D mas_wr_store_type(wr_mas); - request =3D mas_prealloc_calc(wr_mas, entry); - if (!request) + mas->store_type =3D mas_wr_store_type(wr_mas); + mas_prealloc_calc(wr_mas, entry); + if (!mas->node_request) return; =20 - mas_node_count(wr_mas->mas, request); + mas_alloc_nodes(mas, GFP_NOWAIT | __GFP_NOWARN); } =20 /** @@ -5390,7 +5202,6 @@ static inline void mte_destroy_walk(struct maple_enod= e *enode, */ void *mas_store(struct ma_state *mas, void *entry) { - int request; MA_WR_STATE(wr_mas, mas, entry); =20 trace_ma_write(__func__, mas, 0, entry); @@ 
-5420,11 +5231,11 @@ void *mas_store(struct ma_state *mas, void *entry) return wr_mas.content; } =20 - request =3D mas_prealloc_calc(&wr_mas, entry); - if (!request) + mas_prealloc_calc(&wr_mas, entry); + if (!mas->node_request) goto store; =20 - mas_node_count(mas, request); + mas_alloc_nodes(mas, GFP_NOWAIT | __GFP_NOWARN); if (mas_is_err(mas)) return NULL; =20 @@ -5512,20 +5323,19 @@ EXPORT_SYMBOL_GPL(mas_store_prealloc); int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp) { MA_WR_STATE(wr_mas, mas, entry); - int ret =3D 0; - int request; =20 mas_wr_prealloc_setup(&wr_mas); mas->store_type =3D mas_wr_store_type(&wr_mas); - request =3D mas_prealloc_calc(&wr_mas, entry); - if (!request) + mas_prealloc_calc(&wr_mas, entry); + if (!mas->node_request) goto set_flag; =20 mas->mas_flags &=3D ~MA_STATE_PREALLOC; - mas_node_count_gfp(mas, request, gfp); + mas_alloc_nodes(mas, gfp); if (mas_is_err(mas)) { - mas_set_alloc_req(mas, 0); - ret =3D xa_err(mas->node); + int ret =3D xa_err(mas->node); + + mas->node_request =3D 0; mas_destroy(mas); mas_reset(mas); return ret; @@ -5533,7 +5343,7 @@ int mas_preallocate(struct ma_state *mas, void *entry= , gfp_t gfp) =20 set_flag: mas->mas_flags |=3D MA_STATE_PREALLOC; - return ret; + return 0; } EXPORT_SYMBOL_GPL(mas_preallocate); =20 @@ -5547,9 +5357,6 @@ EXPORT_SYMBOL_GPL(mas_preallocate); */ void mas_destroy(struct ma_state *mas) { - struct maple_alloc *node; - unsigned long total; - /* * When using mas_for_each() to insert an expected number of elements, * it is possible that the number inserted is less than the expected @@ -5570,21 +5377,11 @@ void mas_destroy(struct ma_state *mas) } mas->mas_flags &=3D ~(MA_STATE_BULK|MA_STATE_PREALLOC); =20 - total =3D mas_allocated(mas); - while (total) { - node =3D mas->alloc; - mas->alloc =3D node->slot[0]; - if (node->node_count > 1) { - size_t count =3D node->node_count - 1; - - mt_free_bulk(count, (void __rcu **)&node->slot[1]); - total -=3D count; - } - 
kfree(ma_mnode_ptr(node)); - total--; - } + mas->node_request =3D 0; + if (mas->sheaf) + mt_return_sheaf(mas->sheaf); =20 - mas->alloc =3D NULL; + mas->sheaf =3D NULL; } EXPORT_SYMBOL_GPL(mas_destroy); =20 @@ -5634,7 +5431,8 @@ int mas_expected_entries(struct ma_state *mas, unsign= ed long nr_entries) /* Internal nodes */ nr_nodes +=3D DIV_ROUND_UP(nr_nodes, nonleaf_cap); /* Add working room for split (2 nodes) + new parents */ - mas_node_count_gfp(mas, nr_nodes + 3, GFP_KERNEL); + mas->node_request =3D nr_nodes + 3; + mas_alloc_nodes(mas, GFP_KERNEL); =20 /* Detect if allocations run out */ mas->mas_flags |=3D MA_STATE_PREALLOC; @@ -6281,7 +6079,7 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp) mas_alloc_nodes(mas, gfp); } =20 - if (!mas_allocated(mas)) + if (!mas->sheaf) return false; =20 mas->status =3D ma_start; @@ -7676,8 +7474,9 @@ void mas_dump(const struct ma_state *mas) =20 pr_err("[%u/%u] index=3D%lx last=3D%lx\n", mas->offset, mas->end, mas->index, mas->last); - pr_err(" min=3D%lx max=3D%lx alloc=3D" PTR_FMT ", depth=3D%u, flags= =3D%x\n", - mas->min, mas->max, mas->alloc, mas->depth, mas->mas_flags); + pr_err(" min=3D%lx max=3D%lx sheaf=3D" PTR_FMT ", request %lu depth= =3D%u, flags=3D%x\n", + mas->min, mas->max, mas->sheaf, mas->node_request, mas->depth, + mas->mas_flags); if (mas->index > mas->last) pr_err("Check index & last\n"); } diff --git a/lib/test_maple_tree.c b/lib/test_maple_tree.c index cb3936595b0d56a9682ff100eba54693a1427829..1848d127eb50650e7cc2b9dfbb1= 5ed93aa889f01 100644 --- a/lib/test_maple_tree.c +++ b/lib/test_maple_tree.c @@ -2746,6 +2746,7 @@ static noinline void __init check_fuzzer(struct maple= _tree *mt) mtree_test_erase(mt, ULONG_MAX - 10); } =20 +#if 0 /* duplicate the tree with a specific gap */ static noinline void __init check_dup_gaps(struct maple_tree *mt, unsigned long nr_entries, bool zero_start, @@ -2770,6 +2771,7 @@ static noinline void __init check_dup_gaps(struct map= le_tree *mt, mtree_store_range(mt, i*10, 
(i+1)*10 - gap, xa_mk_value(i), GFP_KERNEL); =20 + mt_dump(mt, mt_dump_dec); mt_init_flags(&newmt, MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN); mt_set_non_kernel(99999); down_write(&newmt_lock); @@ -2779,9 +2781,12 @@ static noinline void __init check_dup_gaps(struct ma= ple_tree *mt, =20 rcu_read_lock(); mas_for_each(&mas, tmp, ULONG_MAX) { + printk("%lu nodes %lu\n", mas.index, + kmem_cache_sheaf_count(newmas.sheaf)); newmas.index =3D mas.index; newmas.last =3D mas.last; mas_store(&newmas, tmp); + mt_dump(&newmt, mt_dump_dec); } rcu_read_unlock(); mas_destroy(&newmas); @@ -2878,6 +2883,7 @@ static noinline void __init check_dup(struct maple_tr= ee *mt) cond_resched(); } } +#endif =20 static noinline void __init check_bnode_min_spanning(struct maple_tree *mt) { @@ -4077,9 +4083,11 @@ static int __init maple_tree_seed(void) check_fuzzer(&tree); mtree_destroy(&tree); =20 +#if 0 mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE); check_dup(&tree); mtree_destroy(&tree); +#endif =20 mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE); check_bnode_min_spanning(&tree); diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/ma= ple.c index 7fe91f24849b35723ec6aadbe45ec7d2abedcc11..da3e03d73b52162dab6fa5c368a= d7b71b9e58521 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -57,430 +57,6 @@ struct rcu_reader_struct { struct rcu_test_struct2 *test; }; =20 -static int get_alloc_node_count(struct ma_state *mas) -{ - int count =3D 1; - struct maple_alloc *node =3D mas->alloc; - - if (!node || ((unsigned long)node & 0x1)) - return 0; - while (node->node_count) { - count +=3D node->node_count; - node =3D node->slot[0]; - } - return count; -} - -static void check_mas_alloc_node_count(struct ma_state *mas) -{ - mas_node_count_gfp(mas, MAPLE_ALLOC_SLOTS + 1, GFP_KERNEL); - mas_node_count_gfp(mas, MAPLE_ALLOC_SLOTS + 3, GFP_KERNEL); - MT_BUG_ON(mas->tree, get_alloc_node_count(mas) !=3D mas->alloc->total); - mas_destroy(mas); -} - -/* - * 
check_new_node() - Check the creation of new nodes and error path - * verification. - */ -static noinline void __init check_new_node(struct maple_tree *mt) -{ - - struct maple_node *mn, *mn2, *mn3; - struct maple_alloc *smn; - struct maple_node *nodes[100]; - int i, j, total; - - MA_STATE(mas, mt, 0, 0); - - check_mas_alloc_node_count(&mas); - - /* Try allocating 3 nodes */ - mtree_lock(mt); - mt_set_non_kernel(0); - /* request 3 nodes to be allocated. */ - mas_node_count(&mas, 3); - /* Allocation request of 3. */ - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D 3); - /* Allocate failed. */ - MT_BUG_ON(mt, mas.node !=3D MA_ERROR(-ENOMEM)); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 3); - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, mas.alloc =3D=3D NULL); - MT_BUG_ON(mt, mas.alloc->slot[0] =3D=3D NULL); - mas_push_node(&mas, mn); - mas_reset(&mas); - mas_destroy(&mas); - mtree_unlock(mt); - - - /* Try allocating 1 node, then 2 more */ - mtree_lock(mt); - /* Set allocation request to 1. */ - mas_set_alloc_req(&mas, 1); - /* Check Allocation request of 1. */ - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D 1); - mas_set_err(&mas, -ENOMEM); - /* Validate allocation request. */ - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - /* Eat the requested node. */ - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, mn->slot[0] !=3D NULL); - MT_BUG_ON(mt, mn->slot[1] !=3D NULL); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - mas.status =3D ma_start; - mas_destroy(&mas); - /* Allocate 3 nodes, will fail. */ - mas_node_count(&mas, 3); - /* Drop the lock and allocate 3 nodes. */ - mas_nomem(&mas, GFP_KERNEL); - /* Ensure 3 are allocated. */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 3); - /* Allocation request of 0. 
*/ - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D 0); - - MT_BUG_ON(mt, mas.alloc =3D=3D NULL); - MT_BUG_ON(mt, mas.alloc->slot[0] =3D=3D NULL); - MT_BUG_ON(mt, mas.alloc->slot[1] =3D=3D NULL); - /* Ensure we counted 3. */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 3); - /* Free. */ - mas_reset(&mas); - mas_destroy(&mas); - - /* Set allocation request to 1. */ - mas_set_alloc_req(&mas, 1); - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D 1); - mas_set_err(&mas, -ENOMEM); - /* Validate allocation request. */ - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 1); - /* Check the node is only one node. */ - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, mn->slot[0] !=3D NULL); - MT_BUG_ON(mt, mn->slot[1] !=3D NULL); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - mas_push_node(&mas, mn); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 1); - MT_BUG_ON(mt, mas.alloc->node_count); - - mas_set_alloc_req(&mas, 2); /* request 2 more. */ - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D 2); - mas_set_err(&mas, -ENOMEM); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 3); - MT_BUG_ON(mt, mas.alloc =3D=3D NULL); - MT_BUG_ON(mt, mas.alloc->slot[0] =3D=3D NULL); - MT_BUG_ON(mt, mas.alloc->slot[1] =3D=3D NULL); - for (i =3D 2; i >=3D 0; i--) { - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i); - MT_BUG_ON(mt, !mn); - MT_BUG_ON(mt, not_empty(mn)); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - } - - total =3D 64; - mas_set_alloc_req(&mas, total); /* request 2 more. 
*/ - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D total); - mas_set_err(&mas, -ENOMEM); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - for (i =3D total; i > 0; i--) { - unsigned int e =3D 0; /* expected node_count */ - - if (!MAPLE_32BIT) { - if (i >=3D 35) - e =3D i - 34; - else if (i >=3D 5) - e =3D i - 4; - else if (i >=3D 2) - e =3D i - 1; - } else { - if (i >=3D 4) - e =3D i - 3; - else if (i >=3D 1) - e =3D i - 1; - else - e =3D 0; - } - - MT_BUG_ON(mt, mas.alloc->node_count !=3D e); - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i - 1); - MT_BUG_ON(mt, !mn); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - } - - total =3D 100; - for (i =3D 1; i < total; i++) { - mas_set_alloc_req(&mas, i); - mas_set_err(&mas, -ENOMEM); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - for (j =3D i; j > 0; j--) { - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D j - 1); - MT_BUG_ON(mt, !mn); - MT_BUG_ON(mt, not_empty(mn)); - mas_push_node(&mas, mn); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D j); - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D j - 1); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - } - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - - mas_set_alloc_req(&mas, i); - mas_set_err(&mas, -ENOMEM); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - for (j =3D 0; j <=3D i/2; j++) { - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i - j); - nodes[j] =3D mas_pop_node(&mas); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i - j - 1); - } - - while (j) { - j--; - mas_push_node(&mas, nodes[j]); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i - j); - } - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i); - for (j =3D 0; j <=3D i/2; j++) { - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i - j); - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i - j - 1); - } - 
mas_reset(&mas); - MT_BUG_ON(mt, mas_nomem(&mas, GFP_KERNEL)); - mas_destroy(&mas); - - } - - /* Set allocation request. */ - total =3D 500; - mas_node_count(&mas, total); - /* Drop the lock and allocate the nodes. */ - mas_nomem(&mas, GFP_KERNEL); - MT_BUG_ON(mt, !mas.alloc); - i =3D 1; - smn =3D mas.alloc; - while (i < total) { - for (j =3D 0; j < MAPLE_ALLOC_SLOTS; j++) { - i++; - MT_BUG_ON(mt, !smn->slot[j]); - if (i =3D=3D total) - break; - } - smn =3D smn->slot[0]; /* next. */ - } - MT_BUG_ON(mt, mas_allocated(&mas) !=3D total); - mas_reset(&mas); - mas_destroy(&mas); /* Free. */ - - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - for (i =3D 1; i < 128; i++) { - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i); /* check request filled */ - for (j =3D i; j > 0; j--) { /*Free the requests */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, not_empty(mn)); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - } - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - } - - for (i =3D 1; i < MAPLE_NODE_MASK + 1; i++) { - MA_STATE(mas2, mt, 0, 0); - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D i); /* check request filled */ - for (j =3D 1; j <=3D i; j++) { /* Move the allocations to mas2 */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, not_empty(mn)); - mas_push_node(&mas2, mn); - MT_BUG_ON(mt, mas_allocated(&mas2) !=3D j); - } - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - MT_BUG_ON(mt, mas_allocated(&mas2) !=3D i); - - for (j =3D i; j > 0; j--) { /*Free the requests */ - MT_BUG_ON(mt, mas_allocated(&mas2) !=3D j); - mn =3D mas_pop_node(&mas2); /* get the next node. 
*/ - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, not_empty(mn)); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - } - MT_BUG_ON(mt, mas_allocated(&mas2) !=3D 0); - } - - - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - mas_node_count(&mas, MAPLE_ALLOC_SLOTS + 1); /* Request */ - MT_BUG_ON(mt, mas.node !=3D MA_ERROR(-ENOMEM)); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS + 1); - MT_BUG_ON(mt, mas.alloc->node_count !=3D MAPLE_ALLOC_SLOTS); - - mn =3D mas_pop_node(&mas); /* get the next node. */ - MT_BUG_ON(mt, mn =3D=3D NULL); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS); - MT_BUG_ON(mt, mas.alloc->node_count !=3D MAPLE_ALLOC_SLOTS - 1); - - mas_push_node(&mas, mn); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS + 1); - MT_BUG_ON(mt, mas.alloc->node_count !=3D MAPLE_ALLOC_SLOTS); - - /* Check the limit of pop/push/pop */ - mas_node_count(&mas, MAPLE_ALLOC_SLOTS + 2); /* Request */ - MT_BUG_ON(mt, mas_alloc_req(&mas) !=3D 1); - MT_BUG_ON(mt, mas.node !=3D MA_ERROR(-ENOMEM)); - MT_BUG_ON(mt, !mas_nomem(&mas, GFP_KERNEL)); - MT_BUG_ON(mt, mas_alloc_req(&mas)); - MT_BUG_ON(mt, mas.alloc->node_count !=3D 1); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS + 2); - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS + 1); - MT_BUG_ON(mt, mas.alloc->node_count !=3D MAPLE_ALLOC_SLOTS); - mas_push_node(&mas, mn); - MT_BUG_ON(mt, mas.alloc->node_count !=3D 1); - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS + 2); - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - for (i =3D 1; i <=3D MAPLE_ALLOC_SLOTS + 1; i++) { - mn =3D mas_pop_node(&mas); - MT_BUG_ON(mt, not_empty(mn)); - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - } - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 0); - - - for (i =3D 3; i < 
MAPLE_NODE_MASK * 3; i++) { - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - mas_push_node(&mas, mn); /* put it back */ - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - mn2 =3D mas_pop_node(&mas); /* get the next node. */ - mas_push_node(&mas, mn); /* put them back */ - mas_push_node(&mas, mn2); - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - mn2 =3D mas_pop_node(&mas); /* get the next node. */ - mn3 =3D mas_pop_node(&mas); /* get the next node. */ - mas_push_node(&mas, mn); /* put them back */ - mas_push_node(&mas, mn2); - mas_push_node(&mas, mn3); - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, i); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mn =3D mas_pop_node(&mas); /* get the next node. */ - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - mn =3D mas_pop_node(&mas); /* get the next node. */ - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - mn =3D mas_pop_node(&mas); /* get the next node. 
*/ - mn->parent =3D ma_parent_ptr(mn); - ma_free_rcu(mn); - mas_destroy(&mas); - } - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, 5); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 5); - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, 10); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mas.status =3D ma_start; - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 10); - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, MAPLE_ALLOC_SLOTS - 1); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS - 1); - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, 10 + MAPLE_ALLOC_SLOTS - 1); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mas.status =3D ma_start; - MT_BUG_ON(mt, mas_allocated(&mas) !=3D 10 + MAPLE_ALLOC_SLOTS - 1); - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, MAPLE_ALLOC_SLOTS + 1); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS + 1); - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, MAPLE_ALLOC_SLOTS * 2 + 2); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mas.status =3D ma_start; - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS * 2 + 2); - mas_destroy(&mas); - - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, MAPLE_ALLOC_SLOTS * 2 + 1); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS * 2 + 1); - mas.node =3D MA_ERROR(-ENOMEM); - mas_node_count(&mas, MAPLE_ALLOC_SLOTS * 3 + 2); /* Request */ - mas_nomem(&mas, GFP_KERNEL); /* Fill request */ - mas.status =3D ma_start; - MT_BUG_ON(mt, mas_allocated(&mas) !=3D MAPLE_ALLOC_SLOTS * 3 + 2); - mas_destroy(&mas); - - mtree_unlock(mt); -} - /* * Check erasing including RCU. 
  */
@@ -35452,8 +35028,7 @@ static void check_dfs_preorder(struct maple_tree *mt)
 	mt_init_flags(mt, MT_FLAGS_ALLOC_RANGE);
 	mas_reset(&mas);
 	mt_zero_nr_tallocated();
-	mt_set_non_kernel(200);
-	mas_expected_entries(&mas, max);
+	mt_set_non_kernel(1000);
 	for (count = 0; count <= max; count++) {
 		mas.index = mas.last = count;
 		mas_store(&mas, xa_mk_value(count));
@@ -35518,6 +35093,13 @@ static unsigned char get_vacant_height(struct ma_wr_state *wr_mas, void *entry)
 	return vacant_height;
 }
 
+static int mas_allocated(struct ma_state *mas)
+{
+	if (mas->sheaf)
+		return kmem_cache_sheaf_size(mas->sheaf);
+
+	return 0;
+}
 /* Preallocation testing */
 static noinline void __init check_prealloc(struct maple_tree *mt)
 {
@@ -35536,7 +35118,10 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 
 	/* Spanning store */
 	mas_set_range(&mas, 470, 500);
-	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
+
+	mas_wr_preallocate(&wr_mas, ptr);
+	MT_BUG_ON(mt, mas.store_type != wr_spanning_store);
+	MT_BUG_ON(mt, mas_is_err(&mas));
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
 	vacant_height = get_vacant_height(&wr_mas, ptr);
@@ -35546,6 +35131,7 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	allocated = mas_allocated(&mas);
 	MT_BUG_ON(mt, allocated != 0);
 
+	mas_wr_preallocate(&wr_mas, ptr);
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
@@ -35586,20 +35172,6 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 
-	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
-	allocated = mas_allocated(&mas);
-	height = mas_mt_height(&mas);
-	vacant_height = get_vacant_height(&wr_mas, ptr);
-	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
-	mn = mas_pop_node(&mas);
-	MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1);
-	mas_push_node(&mas, mn);
-	MT_BUG_ON(mt, mas_allocated(&mas) != allocated);
-	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
-	mas_destroy(&mas);
-	allocated = mas_allocated(&mas);
-	MT_BUG_ON(mt, allocated != 0);
-
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
@@ -36400,11 +35972,17 @@ static void check_nomem_writer_race(struct maple_tree *mt)
 	check_load(mt, 6, xa_mk_value(0xC));
 	mtree_unlock(mt);
 
+	mt_set_non_kernel(0);
 	/* test for the same race but with mas_store_gfp() */
 	mtree_store_range(mt, 0, 5, xa_mk_value(0xA), GFP_KERNEL);
 	mtree_store_range(mt, 6, 10, NULL, GFP_KERNEL);
 
 	mas_set_range(&mas, 0, 5);
+
+	/* setup writer 2 that will trigger the race condition */
+	mt_set_private(mt);
+	mt_set_callback(writer2);
+
 	mtree_lock(mt);
 	mas_store_gfp(&mas, NULL, GFP_KERNEL);
 
@@ -36546,10 +36124,6 @@ void farmer_tests(void)
 	check_erase_testset(&tree);
 	mtree_destroy(&tree);
 
-	mt_init_flags(&tree, 0);
-	check_new_node(&tree);
-	mtree_destroy(&tree);
-
 	if (!MAPLE_32BIT) {
 		mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE);
 		check_rcu_simulated(&tree);
diff --git a/tools/testing/shared/linux.c b/tools/testing/shared/linux.c
index 4ceff7969b78cf8e33cd1e021c68bc9f8a02a7a1..8c72571559583759456c2b469a2abc2611117c13 100644
--- a/tools/testing/shared/linux.c
+++ b/tools/testing/shared/linux.c
@@ -64,7 +64,8 @@ void *kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
 
 	if (!(gfp & __GFP_DIRECT_RECLAIM)) {
 		if (!cachep->non_kernel) {
-			cachep->exec_callback = true;
+			if (cachep->callback)
+				cachep->exec_callback = true;
 			return NULL;
 		}
 
@@ -210,6 +211,8 @@ int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
 		for (i = 0; i < size; i++)
 			__kmem_cache_free_locked(cachep, p[i]);
 		pthread_mutex_unlock(&cachep->lock);
+		if (cachep->callback)
+			cachep->exec_callback = true;
 		return 0;
 	}
 
-- 
2.51.0
From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF92A307494 for ; Wed, 3 Sep 2025 13:01:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904479; cv=none; b=gvXGSCQANooYrrw7BGb4RuInXdMBt+QkoT70aRO/umA2+AioqXk50uW+9ttBAOPluxszBQ/yPT4N7Swayy4s9GS3W9fh3+fmnAMfwNKF2WOOsVHWgzwLPOufuH6/q+vV0+F+rD/XP8bqhnzxAIq4V879xOHG5crlUjdYIcIApzk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904479; c=relaxed/simple; bh=m0afrnvbCJmotkoLxdipULPk1ZTXta9HWdHj/XMQhT8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=CcjgpHK/fOmfGKg5x+4MxqfesqJEHz/rfd39o4HXNDPKQMrq7ieSsW8O3e8dnoa2XhAqtnvGiHEyPOCIJEBgvkqkn4cKY85mJN+hc3X4Z1IDed8PGYPYj47X4R87TWbvkSvQhMDdRgsCc+NhWdk/jw7dOINDX3tM1anGZiiOXO0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=y6+RAcBv; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=dfhViHJX; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=y6+RAcBv; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=dfhViHJX; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="y6+RAcBv"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="dfhViHJX"; 
dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="y6+RAcBv"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="dfhViHJX" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D87091F74A; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IrB03Yj54aC/gQ5+QvFtF4NcYsIp0gKGVOi118eDKlc=; b=y6+RAcBveqgVH58KSQ/NMm/qDNxXyyc8ByT/YIBJStlUT3859sylN36mLujFWsuT6VwnqO 1j+mIT1saneq/MLIH8wDiHJsxS3DodN27LY2/VNa9vRgmSyODWnN3A7nwI3ApiYk+iuXR7 l7WvpVhznrxDvNWVLskJlc1ZHTg31iE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IrB03Yj54aC/gQ5+QvFtF4NcYsIp0gKGVOi118eDKlc=; b=dfhViHJXsE8d4aJj5bRKt9+6Uq1SXGO8ZDilqubh83FqCht6xzqfFrQcrkIUxIY5PKtqjD updr/HW3IynRyBBA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IrB03Yj54aC/gQ5+QvFtF4NcYsIp0gKGVOi118eDKlc=; b=y6+RAcBveqgVH58KSQ/NMm/qDNxXyyc8ByT/YIBJStlUT3859sylN36mLujFWsuT6VwnqO 
1j+mIT1saneq/MLIH8wDiHJsxS3DodN27LY2/VNa9vRgmSyODWnN3A7nwI3ApiYk+iuXR7 l7WvpVhznrxDvNWVLskJlc1ZHTg31iE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IrB03Yj54aC/gQ5+QvFtF4NcYsIp0gKGVOi118eDKlc=; b=dfhViHJXsE8d4aJj5bRKt9+6Uq1SXGO8ZDilqubh83FqCht6xzqfFrQcrkIUxIY5PKtqjD updr/HW3IynRyBBA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id BB78213B0A; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id eOt9Ldo7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 15:00:02 +0200 Subject: [PATCH v7 20/21] maple_tree: Add single node allocation support to maple state Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-20-71c114cdefef@suse.cz> References: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz> To: Suren Baghdasaryan , "Liam R. Howlett" , Christoph Lameter , David Rientjes Cc: Roman Gushchin , Harry Yoo , Uladzislau Rezki , Sidhartha Kumar , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, "Liam R. 
Howlett"
X-Mailer: b4 0.14.2
X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,Oracle.com]; R_RATELIMIT(0.00)[to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:mid,suse.cz:email,imap1.dmz-prg2.suse.org:helo,oracle.com:email]
X-Spam-Flag: NO
X-Spam-Level:
X-Spam-Score: -4.30

From: "Liam R. Howlett"

The fast path through a write will require replacing a single node in
the tree. Using a sheaf (32 nodes) is too heavy for the fast path, so
special case the node store operation by just allocating one node in
the maple state.

Signed-off-by: Liam R. Howlett
Signed-off-by: Vlastimil Babka
---
 include/linux/maple_tree.h       |  4 +++-
 lib/maple_tree.c                 | 47 +++++++++++++++++++++++++++++++++++-----
 tools/testing/radix-tree/maple.c |  9 ++++++--
 3 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 166fd67e00d882b1e6de1f80c1b590bba7497cd3..562a1e9e5132b5b1fa8f8402a7cadd8abb65e323 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -443,6 +443,7 @@ struct ma_state {
 	unsigned long min;		/* The minimum index of this node - implied pivot min */
 	unsigned long max;		/* The maximum index of this node - implied pivot max */
 	struct slab_sheaf *sheaf;	/* Allocated nodes for this operation */
+	struct maple_node *alloc;	/* allocated nodes */
 	unsigned long node_request;
 	enum maple_status status;	/* The status of the state (active, start, none, etc) */
 	unsigned char depth;		/* depth of tree descent during write */
@@ -491,8 +492,9 @@ struct ma_wr_state {
 		.status = ma_start,					\
 		.min = 0,						\
 		.max = ULONG_MAX,					\
-		.node_request= 0,					\
 		.sheaf = NULL,						\
+		.alloc = NULL,						\
+		.node_request= 0,					\
 		.mas_flags = 0,						\
 		.store_type = wr_invalid,				\
 	}
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 3f14bfa7fe1c20aac3e127f0defd268c3dbca6aa..f2b81a8d840b219819dd10104cd25120ba913ec7 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1095,16 +1095,23 @@ static int mas_ascend(struct ma_state *mas)
  *
  * Return: A pointer to a maple node.
  */
-static inline struct maple_node *mas_pop_node(struct ma_state *mas)
+static __always_inline struct maple_node *mas_pop_node(struct ma_state *mas)
 {
 	struct maple_node *ret;
 
+	if (mas->alloc) {
+		ret = mas->alloc;
+		mas->alloc = NULL;
+		goto out;
+	}
+
 	if (WARN_ON_ONCE(!mas->sheaf))
 		return NULL;
 
 	ret = kmem_cache_alloc_from_sheaf(maple_node_cache, GFP_NOWAIT, mas->sheaf);
-	memset(ret, 0, sizeof(*ret));
 
+out:
+	memset(ret, 0, sizeof(*ret));
 	return ret;
 }
 
@@ -1115,9 +1122,34 @@ static inline struct maple_node *mas_pop_node(struct ma_state *mas)
  */
 static inline void mas_alloc_nodes(struct ma_state *mas, gfp_t gfp)
 {
-	if (unlikely(mas->sheaf)) {
-		unsigned long refill = mas->node_request;
+	if (!mas->node_request)
+		return;
+
+	if (mas->node_request == 1) {
+		if (mas->sheaf)
+			goto use_sheaf;
+
+		if (mas->alloc)
+			return;
 
+		mas->alloc = mt_alloc_one(gfp);
+		if (!mas->alloc)
+			goto error;
+
+		mas->node_request = 0;
+		return;
+	}
+
+use_sheaf:
+	if (unlikely(mas->alloc)) {
+		kfree(mas->alloc);
+		mas->alloc = NULL;
+	}
+
+	if (mas->sheaf) {
+		unsigned long refill;
+
+		refill = mas->node_request;
 		if(kmem_cache_sheaf_size(mas->sheaf) >= refill) {
 			mas->node_request = 0;
 			return;
@@ -5380,8 +5412,11 @@ void mas_destroy(struct ma_state *mas)
 	mas->node_request = 0;
 	if (mas->sheaf)
 		mt_return_sheaf(mas->sheaf);
-
 	mas->sheaf = NULL;
+
+	if (mas->alloc)
+		kfree(mas->alloc);
+	mas->alloc = NULL;
 }
 EXPORT_SYMBOL_GPL(mas_destroy);
 
@@ -6079,7 +6114,7 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
 		mas_alloc_nodes(mas, gfp);
 	}
 
-	if (!mas->sheaf)
+	if (!mas->sheaf && !mas->alloc)
 		return false;
 
 	mas->status = ma_start;
diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
index da3e03d73b52162dab6fa5c368ad7b71b9e58521..89da991e12cd97e44971757ddc105ef46c68ea4c 100644
--- a/tools/testing/radix-tree/maple.c
+++ b/tools/testing/radix-tree/maple.c
@@ -35095,10 +35095,15 @@ static unsigned
char get_vacant_height(struct ma_= wr_state *wr_mas, void *entry) =20 static int mas_allocated(struct ma_state *mas) { + int total =3D 0; + + if (mas->alloc) + total++; + if (mas->sheaf) - return kmem_cache_sheaf_size(mas->sheaf); + total +=3D kmem_cache_sheaf_size(mas->sheaf); =20 - return 0; + return total; } /* Preallocation testing */ static noinline void __init check_prealloc(struct maple_tree *mt) --=20 2.51.0 From nobody Fri Oct 3 07:42:42 2025 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0CC13081AE for ; Wed, 3 Sep 2025 13:01:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904500; cv=none; b=gXH1jOKoTSdcObvUdGwrTSQHkAgieiO2ZWldn7rdA2DfkOvLgOkG79WdeTcUMl9cPIltX0Imxqjk2aqyTbZhWHCSj3/8PttAYZYbPcrVDexYdoK8sYRv4OYyNCLZ5pvZdFUcwVf2+flJ4J0516lzMcpZQJIxWhvuwbbvtIGxieg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756904500; c=relaxed/simple; bh=sXKOtebAdp7OI7DhBdyiQlYqJN1iq1WqFWe2PLBTrdo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ZGecZvKmgub+O6rHxJzrdli/XrmvpyYahpEd+WL3ahDz6USDhsTTZNN4yS/hRqx4dPZxQSWC8vFgbkIeW5BmH+RBi97ln7MBruP4tXgSFJmLW9OZOEOYKYKfO8zj4dgrDEOEcv7MviZtg4g+tgnwClehuTS1IzwS73lN9FoOwsw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=pbjD0DBD; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=spr1fns8; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=fpMwkQEp; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=+0CfIevh; arc=none 
smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="pbjD0DBD"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="spr1fns8"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="fpMwkQEp"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="+0CfIevh" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id E7EF82122F; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904411; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yCcVGdp/KhsEZz39yl99PFemz0v+mJhBuOia0com5iU=; b=pbjD0DBDSP+X6y6paUUVdqmkZ5a/x6h4+HkbWuCEvuceCXV17FMqnuw+CrqYdLT8DlYZ7t pnRA1I/yBLqIe10G0C04ZxjSjyz9BLznFqlDeaL02MXVqGg/Uhbd5l7pXKwnO5I+3tb52W GBFgkB1ao0vNkTmb37ghR8xFB+u+mu4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904411; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yCcVGdp/KhsEZz39yl99PFemz0v+mJhBuOia0com5iU=; b=spr1fns8XDkY6mtDcaGBF4O2DIJ7W1U6L9nblQc757wCrYgwqQbDNWjIUCPKLAmf17YCK3 e0fLSj/jKFuRftDg== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yCcVGdp/KhsEZz39yl99PFemz0v+mJhBuOia0com5iU=; b=fpMwkQEp4Fd1CSGGBS/ziHHoTHOVVj3/g3fvb0Mo8AkwPXq3cDTAUwhJKDkOZS6a4FJI3s y7hlEpOb4JGbB2rkpRHMbgYeuTpKY4bRn+VLlfigoj0132hVOhhectjhD/dIsoBXu8hxUB 4M+VF9DJHWETBNTFrRJZO98NUsdkphc= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1756904410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yCcVGdp/KhsEZz39yl99PFemz0v+mJhBuOia0com5iU=; b=+0CfIevh4Oh3ByHQ4p+SULAyPIX5oVgNniHSh2O9xjQN5L0HW89rnkq1D8/lrWJuUUc9XY AyLDPdbVellDJiBw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id D009213ACF; Wed, 3 Sep 2025 13:00:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id uESAMto7uGitOAAAD6G6ig (envelope-from ); Wed, 03 Sep 2025 13:00:10 +0000 From: Vlastimil Babka Date: Wed, 03 Sep 2025 15:00:03 +0200 Subject: [PATCH v7 21/21] maple_tree: Convert forking to use the sheaf interface Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250903-slub-percpu-caches-v7-21-71c114cdefef@suse.cz> References: 
<20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz>
In-Reply-To: <20250903-slub-percpu-caches-v7-0-71c114cdefef@suse.cz>
To: Suren Baghdasaryan, "Liam R. Howlett", Christoph Lameter, David Rientjes
Cc: Roman Gushchin, Harry Yoo, Uladzislau Rezki, Sidhartha Kumar, linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org, vbabka@suse.cz, "Liam R. Howlett"
X-Mailer: b4 0.14.2
X-Spam-Level:
X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; FUZZY_RATELIMITED(0.00)[rspamd.com]; TO_DN_SOME(0.00)[]; RCPT_COUNT_TWELVE(0.00)[14]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; FREEMAIL_CC(0.00)[linux.dev,oracle.com,gmail.com,kvack.org,vger.kernel.org,lists.infradead.org,suse.cz,Oracle.com]; R_RATELIMIT(0.00)[to(RL941jgdop1fyjkq8h4),to_ip_from(RLwn5r54y1cp81no5tmbbew5oc)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:email,suse.cz:mid,imap1.dmz-prg2.suse.org:helo,oracle.com:email]
X-Spam-Flag: NO
X-Spam-Score: -4.30

From: "Liam R. Howlett"

Use the generic interface, which should result in fewer bulk allocations
during forking.

A part of this is to abstract the freeing of the sheaf or maple state
allocations into its own function so mas_destroy() and the tree
duplication code can use the same functionality to return any unused
resources.

Signed-off-by: Liam R. Howlett
Reviewed-by: Suren Baghdasaryan
Signed-off-by: Vlastimil Babka
---
 lib/maple_tree.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index f2b81a8d840b219819dd10104cd25120ba913ec7..c185dbc90007a0046a37f194e73e261caa1848c0 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1172,6 +1172,19 @@ static inline void mas_alloc_nodes(struct ma_state *mas, gfp_t gfp)
 	mas_set_err(mas, -ENOMEM);
 }
 
+static inline void mas_empty_nodes(struct ma_state *mas)
+{
+	mas->node_request = 0;
+	if (mas->sheaf) {
+		mt_return_sheaf(mas->sheaf);
+		mas->sheaf = NULL;
+	}
+
+	if (mas->alloc) {
+		kfree(mas->alloc);
+		mas->alloc = NULL;
+	}
+}
 
 /*
  * mas_free() - Free an encoded maple node
@@ -5408,15 +5421,7 @@ void mas_destroy(struct ma_state *mas)
 		mas->mas_flags &= ~MA_STATE_REBALANCE;
 	}
 	mas->mas_flags &= ~(MA_STATE_BULK|MA_STATE_PREALLOC);
-
-	mas->node_request = 0;
-	if (mas->sheaf)
-		mt_return_sheaf(mas->sheaf);
-	mas->sheaf = NULL;
-
-	if (mas->alloc)
-		kfree(mas->alloc);
-	mas->alloc = NULL;
+	mas_empty_nodes(mas);
 }
 EXPORT_SYMBOL_GPL(mas_destroy);
 
@@ -6504,7 +6509,7 @@ static inline void mas_dup_alloc(struct ma_state *mas, struct ma_state *new_mas,
 	struct maple_node *node = mte_to_node(mas->node);
 	struct maple_node *new_node = mte_to_node(new_mas->node);
 	enum maple_type type;
-	unsigned char request, count, i;
+	unsigned char count, i;
 	void __rcu **slots;
 	void __rcu **new_slots;
 	unsigned long val;
@@ -6512,20 +6517,17 @@ static inline void mas_dup_alloc(struct ma_state *mas, struct ma_state *new_mas,
 	/* Allocate memory for child nodes. */
 	type = mte_node_type(mas->node);
 	new_slots = ma_slots(new_node, type);
-	request = mas_data_end(mas) + 1;
-	count = mt_alloc_bulk(gfp, request, (void **)new_slots);
-	if (unlikely(count < request)) {
-		memset(new_slots, 0, request * sizeof(void *));
-		mas_set_err(mas, -ENOMEM);
+	count = mas->node_request = mas_data_end(mas) + 1;
+	mas_alloc_nodes(mas, gfp);
+	if (unlikely(mas_is_err(mas)))
 		return;
-	}
 
-	/* Restore node type information in slots. */
 	slots = ma_slots(node, type);
 	for (i = 0; i < count; i++) {
 		val = (unsigned long)mt_slot_locked(mas->tree, slots, i);
 		val &= MAPLE_NODE_MASK;
-		((unsigned long *)new_slots)[i] |= val;
+		new_slots[i] = ma_mnode_ptr((unsigned long)mas_pop_node(mas) |
+					    val);
 	}
 }
 
@@ -6579,7 +6581,7 @@ static inline void mas_dup_build(struct ma_state *mas, struct ma_state *new_mas,
 			/* Only allocate child nodes for non-leaf nodes. */
 			mas_dup_alloc(mas, new_mas, gfp);
 			if (unlikely(mas_is_err(mas)))
-				return;
+				goto empty_mas;
 		} else {
 			/*
 			 * This is the last leaf node and duplication is
@@ -6612,6 +6614,8 @@ static inline void mas_dup_build(struct ma_state *mas, struct ma_state *new_mas,
 	/* Make them the same height */
 	new_mas->tree->ma_flags = mas->tree->ma_flags;
 	rcu_assign_pointer(new_mas->tree->ma_root, root);
+empty_mas:
+	mas_empty_nodes(mas);
 }
 
 /**
-- 
2.51.0