From: Peng Zhang <zhangpeng.00@bytedance.com>
To: Liam.Howlett@oracle.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
	Peng Zhang <zhangpeng.00@bytedance.com>
Subject: [PATCH v4 3/4] maple_tree: optimize mas_wr_append(), also improve duplicating VMAs
Date: Wed, 28 Jun 2023 15:36:56 +0800
Message-Id: <20230628073657.75314-4-zhangpeng.00@bytedance.com>
X-Mailer: git-send-email 2.37.0 (Apple Git-136)
In-Reply-To: <20230628073657.75314-1-zhangpeng.00@bytedance.com>
References: <20230628073657.75314-1-zhangpeng.00@bytedance.com>

When the new range is completely covered by the original last range and
touches neither of its boundaries, two new entries can be appended to the
end of the node as a fast path. The original last pivot is updated at the
very end, and the two newly appended entries cannot be reached before that
update, so the fast path is also safe in RCU mode.

This is useful for sequential insertion, which is what we do in
dup_mmap(). Enabling BENCH_FORK in test_maple_tree and just running
bench_forking() gives the following runtimes:

before:            after:
17,874.83 msec     15,738.38 msec

This is about a 12% performance improvement for duplicating VMAs.

Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 lib/maple_tree.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index bfffbb7cab26..56b9b5be28c8 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4266,10 +4266,10 @@ static inline unsigned char mas_wr_new_end(struct ma_wr_state *wr_mas)
  *
  * Return: True if appended, false otherwise
  */
-static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
+static inline bool mas_wr_append(struct ma_wr_state *wr_mas,
+		unsigned char new_end)
 {
 	unsigned char end = wr_mas->node_end;
-	unsigned char new_end = end + 1;
 	struct ma_state *mas = wr_mas->mas;
 	unsigned char node_pivots = mt_pivots[wr_mas->type];
 
@@ -4281,16 +4281,27 @@ static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
 		ma_set_meta(wr_mas->node, maple_leaf_64, 0, new_end);
 	}
 
-	if (mas->last == wr_mas->r_max) {
-		/* Append to end of range */
-		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->entry);
-		wr_mas->pivots[end] = mas->index - 1;
-		mas->offset = new_end;
+	if (new_end == wr_mas->node_end + 1) {
+		if (mas->last == wr_mas->r_max) {
+			/* Append to end of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->entry);
+			wr_mas->pivots[end] = mas->index - 1;
+			mas->offset = new_end;
+		} else {
+			/* Append to start of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->content);
+			wr_mas->pivots[end] = mas->last;
+			rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		}
 	} else {
-		/* Append to start of range */
+		/* Append to the range without touching any boundaries. */
 		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->content);
-		wr_mas->pivots[end] = mas->last;
-		rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		wr_mas->pivots[end + 1] = mas->last;
+		rcu_assign_pointer(wr_mas->slots[end + 1], wr_mas->entry);
+		wr_mas->pivots[end] = mas->index - 1;
+		mas->offset = end + 1;
 	}
 
 	if (!wr_mas->content || !wr_mas->entry)
@@ -4337,7 +4348,7 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
 		goto slow_path;
 
 	/* Attempt to append */
-	if (new_end == wr_mas->node_end + 1 && mas_wr_append(wr_mas))
+	if (mas_wr_append(wr_mas, new_end))
 		return;
 
 	if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas))
-- 
2.20.1
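
To make the new case concrete: a store whose range lies strictly inside
the last existing range of a leaf now splits that range by appending two
entries instead of falling through to the slow path. Below is a minimal
caller-level sketch, not part of the patch: the ranges and values are
hypothetical, and whether a given store actually hits this branch depends
on the node layout at the time of the write. mtree_store_range() and
DEFINE_MTREE() are the existing public maple tree API.

	/* Illustrative sketch only; values and tree shape are hypothetical. */
	#include <linux/maple_tree.h>
	#include <linux/gfp.h>

	static DEFINE_MTREE(mt);

	static void append_contained_range(void)
	{
		/* Suppose the last range in a leaf covers [10, 20]. */
		mtree_store_range(&mt, 10, 20, (void *)0xA, GFP_KERNEL);

		/*
		 * [13, 17] is completely covered by [10, 20] and touches
		 * neither boundary.  mas_wr_append() now appends the new
		 * entry for [13, 17] and the old entry for [18, 20], and
		 * only then trims the original pivot down to 12.  Readers
		 * cannot reach the appended slots before that final pivot
		 * update, which is what keeps the fast path RCU-safe.
		 */
		mtree_store_range(&mt, 13, 17, (void *)0xB, GFP_KERNEL);

		/* Now: [10,12] -> 0xA, [13,17] -> 0xB, [18,20] -> 0xA. */
	}

For reproducing the numbers above: BENCH_FORK is a commented-out define
in lib/test_maple_tree.c; uncommenting it and rebuilding the test module
runs bench_forking().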