From nobody Sat Jun 27 22:29:07 2026
From: Jakub Matěna
To: linux-mm@kvack.org
Cc: patches@lists.linux.dev, linux-kernel@vger.kernel.org, vbabka@suse.cz,
 mhocko@kernel.org, mgorman@techsingularity.net, willy@infradead.org,
 liam.howlett@oracle.com, hughd@google.com, kirill@shutemov.name,
 riel@surriel.com, rostedt@goodmis.org, peterz@infradead.org,
 Jakub Matěna
Subject: [RFC PATCH 1/4] mm: refactor of vma_merge()
Date: Fri, 18 Feb 2022 13:20:16 +0100
Message-Id: <20220218122019.130274-2-matenajakub@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220218122019.130274-1-matenajakub@gmail.com>
References: <20220218122019.130274-1-matenajakub@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

Refactor vma_merge() to make it shorter, more understandable and
suitable for tracing of successful merges made possible by following
patches in the series.

Signed-off-by: Jakub Matěna
---
 mm/mmap.c | 81 +++++++++++++++++++++++++++----------------------------
 1 file changed, 39 insertions(+), 42 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 1e8fdb0b51ed..b55e11f20571 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1172,6 +1172,9 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
 	struct vm_area_struct *area, *next;
 	int err;
+	int merge_prev = 0;
+	int merge_both = 0;
+	int merge_next = 0;
 
 	/*
 	 * We later require that vma->vm_flags == vm_flags,
@@ -1191,65 +1194,59 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 	VM_WARN_ON(addr >= end);
 
 	/*
-	 * Can it merge with the predecessor?
+	 * Can we merge predecessor?
 	 */
 	if (prev && prev->vm_end == addr &&
 			mpol_equal(vma_policy(prev), policy) &&
 			can_vma_merge_after(prev, vm_flags,
 					    anon_vma, file, pgoff,
 					    vm_userfaultfd_ctx, anon_name)) {
-		/*
-		 * OK, it can. Can we now merge in the successor as well?
-		 */
-		if (next && end == next->vm_start &&
-				mpol_equal(policy, vma_policy(next)) &&
-				can_vma_merge_before(next, vm_flags,
-						     anon_vma, file,
-						     pgoff+pglen,
-						     vm_userfaultfd_ctx, anon_name) &&
-				is_mergeable_anon_vma(prev->anon_vma,
-						      next->anon_vma, NULL)) {
-							/* cases 1, 6 */
-			err = __vma_adjust(prev, prev->vm_start,
-					 next->vm_end, prev->vm_pgoff, NULL,
-					 prev);
-		} else					/* cases 2, 5, 7 */
-			err = __vma_adjust(prev, prev->vm_start,
-					 end, prev->vm_pgoff, NULL, prev);
-		if (err)
-			return NULL;
-		khugepaged_enter_vma_merge(prev, vm_flags);
-		return prev;
+		merge_prev = true;
 	}
-
 	/*
-	 * Can this new request be merged in front of next?
+	 * Can we merge successor?
 	 */
 	if (next && end == next->vm_start &&
 			mpol_equal(policy, vma_policy(next)) &&
 			can_vma_merge_before(next, vm_flags,
-					     anon_vma, file, pgoff+pglen,
-					     vm_userfaultfd_ctx, anon_name)) {
+					anon_vma, file, pgoff+pglen,
+					vm_userfaultfd_ctx, anon_name)) {
+		merge_next = true;
+	}
+	/*
+	 * Can we merge both predecessor and successor?
+	 */
+	if (merge_prev && merge_next)
+		merge_both = is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL);
+
+	if (merge_both) {	 /* cases 1, 6 */
+		err = __vma_adjust(prev, prev->vm_start,
+					next->vm_end, prev->vm_pgoff, NULL,
+					prev);
+		area = prev;
+	} else if (merge_prev) {			/* cases 2, 5, 7 */
+		err = __vma_adjust(prev, prev->vm_start,
+					end, prev->vm_pgoff, NULL, prev);
+		area = prev;
+	} else if (merge_next) {
 		if (prev && addr < prev->vm_end)	/* case 4 */
 			err = __vma_adjust(prev, prev->vm_start,
-					 addr, prev->vm_pgoff, NULL, next);
-		else {					/* cases 3, 8 */
+					addr, prev->vm_pgoff, NULL, next);
+		else					/* cases 3, 8 */
 			err = __vma_adjust(area, addr, next->vm_end,
-					 next->vm_pgoff - pglen, NULL, next);
-			/*
-			 * In case 3 area is already equal to next and
-			 * this is a noop, but in case 8 "area" has
-			 * been removed and next was expanded over it.
-			 */
-			area = next;
-		}
-		if (err)
-			return NULL;
-		khugepaged_enter_vma_merge(area, vm_flags);
-		return area;
+					next->vm_pgoff - pglen, NULL, next);
+		area = next;
+	} else {
+		err = -1;
 	}
 
-	return NULL;
+	/*
+	 * Cannot merge with predecessor or successor or error in __vma_adjust?
+	 */
+	if (err)
+		return NULL;
+	khugepaged_enter_vma_merge(area, vm_flags);
+	return area;
 }
 
 /*
-- 
2.34.1

From nobody Sat Jun 27 22:29:07 2026
From: Jakub Matěna
To: linux-mm@kvack.org
Cc: patches@lists.linux.dev, linux-kernel@vger.kernel.org, vbabka@suse.cz,
 mhocko@kernel.org, mgorman@techsingularity.net, willy@infradead.org,
 liam.howlett@oracle.com, hughd@google.com, kirill@shutemov.name,
 riel@surriel.com, rostedt@goodmis.org, peterz@infradead.org,
 Jakub Matěna
Subject: [RFC PATCH 2/4] mm: adjust page offset in mremap
Date: Fri, 18 Feb 2022 13:20:17 +0100
Message-Id: <20220218122019.130274-3-matenajakub@gmail.com>
In-Reply-To: <20220218122019.130274-1-matenajakub@gmail.com>
References: <20220218122019.130274-1-matenajakub@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Adjust page offset of a VMA when it's moved to a new location by mremap.
This is made possible for all VMAs that do not share their anonymous
pages with other processes. Previously this was possible only for not
yet faulted VMAs.
When the page offset does not correspond to the virtual address of the
anonymous VMA any merge attempt with another VMA will fail.

Signed-off-by: Jakub Matěna
---
 mm/mmap.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 95 insertions(+), 6 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index b55e11f20571..8d253b46b349 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3224,6 +3224,91 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 	return 0;
 }
 
+bool rbst_no_children(struct anon_vma *av, struct rb_node *node)
+{
+	struct anon_vma_chain *model;
+	struct anon_vma_chain *avc;
+
+	if (node == NULL) /* leaf node */
+		return true;
+	avc = container_of(node, typeof(*(model)), rb);
+	if (avc->vma->anon_vma != av)
+		/*
+		 * Inequality implies avc belongs
+		 * to a VMA of a child process
+		 */
+		return false;
+	return (rbst_no_children(av, node->rb_left) &&
+	rbst_no_children(av, node->rb_right));
+}
+
+/*
+ * Check if none of the VMAs connected to the given
+ * anon_vma via anon_vma_chain are in child relationship
+ */
+bool rbt_no_children(struct anon_vma *av)
+{
+	struct rb_node *root_node;
+
+	if (av == NULL || av->degree <= 1) /* Higher degree might not necessarily imply children */
+		return true;
+	root_node = av->rb_root.rb_root.rb_node;
+	return rbst_no_children(av, root_node);
+}
+
+/**
+ * update_faulted_pgoff() - Update faulted pages of a vma
+ * @vma: VMA being moved
+ * @addr: new virtual address
+ * @pgoff: pointer to pgoff which is updated
+ * If the vma and its pages are not shared with another process, update
+ * the new pgoff and also update index parameter (copy of the pgoff) in
+ * all faulted pages.
+ */
+bool update_faulted_pgoff(struct vm_area_struct *vma, unsigned long addr, pgoff_t *pgoff)
+{
+	unsigned long pg_iter = 0;
+	unsigned long pg_iters = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+	/* 1.] Check vma is not shared with other processes */
+	if (vma->anon_vma->root != vma->anon_vma || !rbt_no_children(vma->anon_vma))
+		return false;
+
+	/* 2.] Check all pages are not shared */
+	for (; pg_iter < pg_iters; ++pg_iter) {
+		bool pages_not_shared = true;
+		unsigned long shift = pg_iter << PAGE_SHIFT;
+		struct page *phys_page = follow_page(vma, vma->vm_start + shift, FOLL_GET);
+
+		if (phys_page == NULL)
+			continue;
+
+		/* Check page is not shared with other processes */
+		if (page_mapcount(phys_page) > 1)
+			pages_not_shared = false;
+		put_page(phys_page);
+		if (!pages_not_shared)
+			return false;
+	}
+
+	/* 3.] Update index in all pages to this new pgoff */
+	pg_iter = 0;
+	*pgoff = addr >> PAGE_SHIFT;
+
+	for (; pg_iter < pg_iters; ++pg_iter) {
+		unsigned long shift = pg_iter << PAGE_SHIFT;
+		struct page *phys_page = follow_page(vma, vma->vm_start + shift, FOLL_GET);
+
+		if (phys_page == NULL)
+			continue;
+		lock_page(phys_page);
+		phys_page->index = *pgoff + pg_iter;
+		unlock_page(phys_page);
+		put_page(phys_page);
+	}
+	return true;
+}
+
 /*
  * Copy the vma structure to a new location in the same mm,
  * prior to moving page table entries, to effect an mremap move.
@@ -3237,15 +3322,19 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *new_vma, *prev;
 	struct rb_node **rb_link, *rb_parent;
-	bool faulted_in_anon_vma = true;
+	bool anon_pgoff_updated = false;
 
 	/*
-	 * If anonymous vma has not yet been faulted, update new pgoff
+	 * Try to update new pgoff for anonymous vma
 	 * to match new location, to increase its chance of merging.
 	 */
-	if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma)) {
-		pgoff = addr >> PAGE_SHIFT;
-		faulted_in_anon_vma = false;
+	if (unlikely(vma_is_anonymous(vma))) {
+		if (!vma->anon_vma) {
+			pgoff = addr >> PAGE_SHIFT;
+			anon_pgoff_updated = true;
+		} else {
+			anon_pgoff_updated = update_faulted_pgoff(vma, addr, &pgoff);
+		}
 	}
 
 	if (find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent))
@@ -3271,7 +3360,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 			 * safe. It is only safe to keep the vm_pgoff
 			 * linear if there are no pages mapped yet.
 			 */
-			VM_BUG_ON_VMA(faulted_in_anon_vma, new_vma);
+			VM_BUG_ON_VMA(!anon_pgoff_updated, new_vma);
 			*vmap = vma = new_vma;
 		}
 		*need_rmap_locks = (new_vma->vm_pgoff <= vma->vm_pgoff);
-- 
2.34.1

From nobody Sat Jun 27 22:29:07 2026
From: Jakub Matěna
To: linux-mm@kvack.org
Cc: patches@lists.linux.dev, linux-kernel@vger.kernel.org, vbabka@suse.cz,
 mhocko@kernel.org, mgorman@techsingularity.net, willy@infradead.org,
 liam.howlett@oracle.com, hughd@google.com, kirill@shutemov.name,
 riel@surriel.com, rostedt@goodmis.org, peterz@infradead.org,
 Jakub Matěna
Subject: [RFC PATCH 3/4] mm: enable merging of VMAs with different anon_vmas
Date: Fri, 18 Feb 2022 13:20:18 +0100
Message-Id: <20220218122019.130274-4-matenajakub@gmail.com>
In-Reply-To: <20220218122019.130274-1-matenajakub@gmail.com>
References: <20220218122019.130274-1-matenajakub@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Enable merging of a VMA even when it is linked to different anon_vma
than the one it is being merged to, but only if the VMA in question
does not share any page with a parent or child process. Every
anonymous page stores a pointer to its anon_vma in the parameter
mapping, which is now updated as part of the merge process.

Signed-off-by: Jakub Matěna
---
 include/linux/rmap.h | 17 ++++++++++++++++-
 mm/mmap.c            | 15 ++++++++++++++-
 mm/rmap.c            | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index e704b1a4c06c..c8508a4ebc46 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -137,10 +137,13 @@ static inline void anon_vma_unlock_read(struct anon_vma *anon_vma)
  */
 void anon_vma_init(void);	/* create anon_vma_cachep */
 int __anon_vma_prepare(struct vm_area_struct *);
+void reconnect_pages(struct vm_area_struct *vma, struct vm_area_struct *next);
 void unlink_anon_vmas(struct vm_area_struct *);
 int anon_vma_clone(struct vm_area_struct *, struct vm_area_struct *);
 int anon_vma_fork(struct vm_area_struct *, struct vm_area_struct *);
 
+bool rbt_no_children(struct anon_vma *av);
+
 static inline int anon_vma_prepare(struct vm_area_struct *vma)
 {
 	if (likely(vma->anon_vma))
@@ -149,10 +152,22 @@ static inline int anon_vma_prepare(struct vm_area_struct *vma)
 	return __anon_vma_prepare(vma);
 }
 
+/**
+ * anon_vma_merge() - Merge anon_vmas of the given VMAs
+ * @vma: VMA being merged to
+ * @next: VMA being merged
+ */
 static inline void anon_vma_merge(struct vm_area_struct *vma,
 				  struct vm_area_struct *next)
 {
-	VM_BUG_ON_VMA(vma->anon_vma != next->anon_vma, vma);
+	struct anon_vma *anon_vma1 = vma->anon_vma;
+	struct anon_vma *anon_vma2 = next->anon_vma;
+
+	VM_BUG_ON_VMA(anon_vma1 && anon_vma2 && anon_vma1 != anon_vma2 &&
+			((anon_vma2 != anon_vma2->root)
+			|| !rbt_no_children(anon_vma2)), vma);
+
+	reconnect_pages(vma, next);
 	unlink_anon_vmas(next);
 }
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 8d253b46b349..ed91d0cd2111 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1065,7 +1065,20 @@ static inline int is_mergeable_anon_vma(struct anon_vma *anon_vma1,
 	if ((!anon_vma1 || !anon_vma2) &&
 		(!vma || list_is_singular(&vma->anon_vma_chain)))
 		return 1;
-	return anon_vma1 == anon_vma2;
+	if (anon_vma1 == anon_vma2)
+		return 1;
+	/*
+	 * Different anon_vma but not shared by several processes
+	 */
+	else if ((anon_vma1 && anon_vma2) &&
+			(anon_vma1 == anon_vma1->root)
+			&& (rbt_no_children(anon_vma1)))
+		return 1;
+	/*
+	 * Different anon_vma and shared -> unmergeable
+	 */
+	else
+		return 0;
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index 6a1e8c7f6213..1093b518b0be 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -387,6 +387,46 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 	return -ENOMEM;
 }
 
+/**
+ * reconnect_pages() - Reconnect physical pages from old to vma
+ * @vma: VMA to newly contain all physical pages of old
+ * @old: old VMA being merged to vma
+ */
+void reconnect_pages(struct vm_area_struct *vma, struct vm_area_struct *old)
+{
+	struct anon_vma *anon_vma1 = vma->anon_vma;
+	struct anon_vma *anon_vma2 = old->anon_vma;
+	unsigned long pg_iter;
+	int pg_iters;
+
+	if (anon_vma1 == anon_vma2 || anon_vma1 == NULL || anon_vma2 == NULL)
+		return; /* Nothing to do */
+
+	/* Modify page->mapping for all pages in old */
+	pg_iter = 0;
+	pg_iters = (old->vm_end - old->vm_start) >> PAGE_SHIFT;
+
+	for (; pg_iter < pg_iters; ++pg_iter) {
+		/* Get the physical page */
+		unsigned long shift = pg_iter << PAGE_SHIFT;
+		struct page *phys_page = follow_page(old, old->vm_start + shift, FOLL_GET);
+		struct anon_vma *page_anon_vma;
+
+		/* Do some checks and lock the page */
+		if (phys_page == NULL)
+			continue; /* Virtual memory page is not mapped */
+		lock_page(phys_page);
+		page_anon_vma = page_get_anon_vma(phys_page);
+		if (page_anon_vma != NULL) { /* NULL in case of ZERO_PAGE */
+			VM_BUG_ON_VMA(page_anon_vma != old->anon_vma, old);
+			/* Update physical page's mapping */
+			page_move_anon_rmap(phys_page, vma);
+		}
+		unlock_page(phys_page);
+		put_page(phys_page);
+	}
+}
+
 void unlink_anon_vmas(struct vm_area_struct *vma)
 {
 	struct anon_vma_chain *avc, *next;
-- 
2.34.1

From nobody Sat Jun 27 22:29:07 2026
From: Jakub Matěna
To: linux-mm@kvack.org
Cc: patches@lists.linux.dev, linux-kernel@vger.kernel.org, vbabka@suse.cz,
 mhocko@kernel.org, mgorman@techsingularity.net, willy@infradead.org,
 liam.howlett@oracle.com, hughd@google.com, kirill@shutemov.name,
 riel@surriel.com, rostedt@goodmis.org, peterz@infradead.org,
 Jakub Matěna
Subject: [RFC PATCH 4/4] mm: add tracing for VMA merges
Date: Fri, 18 Feb 2022 13:20:19 +0100
Message-Id: <20220218122019.130274-5-matenajakub@gmail.com>
In-Reply-To: <20220218122019.130274-1-matenajakub@gmail.com>
References: <20220218122019.130274-1-matenajakub@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Add trace support for vma_merge() to measure successful and
unsuccessful merges of two VMAs with distinct anon_vmas, and trace
support for merges enabled by the page offset update introduced by a
previous patch in this series.

Signed-off-by: Jakub Matěna
---
 include/trace/events/mmap.h | 55 ++++++++++++++++++++++++++++++++
 mm/internal.h               | 11 +++++++
 mm/mmap.c                   | 63 ++++++++++++++++++++-----------------
 3 files changed, 100 insertions(+), 29 deletions(-)

diff --git a/include/trace/events/mmap.h b/include/trace/events/mmap.h
index 4661f7ba07c0..9f6439e2ed2d 100644
--- a/include/trace/events/mmap.h
+++ b/include/trace/events/mmap.h
@@ -6,6 +6,7 @@
 #define _TRACE_MMAP_H
 
 #include <linux/tracepoint.h>
+#include <../mm/internal.h>
 
 TRACE_EVENT(vm_unmapped_area,
 
@@ -42,6 +43,60 @@ TRACE_EVENT(vm_unmapped_area,
 		__entry->low_limit, __entry->high_limit, __entry->align_mask,
 		__entry->align_offset)
 );
+
+TRACE_EVENT(vm_av_merge,
+
+	TP_PROTO(int merged, enum vma_merge_res merge_prev, enum vma_merge_res merge_next, enum vma_merge_res merge_both),
+
+	TP_ARGS(merged, merge_prev, merge_next, merge_both),
+
+	TP_STRUCT__entry(
+		__field(int, merged)
+		__field(enum vma_merge_res, predecessor_different_av)
+		__field(enum vma_merge_res, successor_different_av)
+		__field(enum vma_merge_res, predecessor_with_successor_different_av)
+		__field(int, diff_count)
+		__field(int, failed_count)
+	),
+
+	TP_fast_assign(
+		__entry->merged = merged == 0;
+		__entry->predecessor_different_av = merge_prev;
+		__entry->successor_different_av = merge_next;
+		__entry->predecessor_with_successor_different_av = merge_both;
+		__entry->diff_count = (merge_prev == AV_MERGE_DIFFERENT) +
+		(merge_next == AV_MERGE_DIFFERENT) + (merge_both == AV_MERGE_DIFFERENT);
+		__entry->failed_count = (merge_prev == AV_MERGE_FAILED) +
+		(merge_next == AV_MERGE_FAILED) + (merge_both == AV_MERGE_FAILED);
+	),
+
+	TP_printk("merged=%d predecessor=%d successor=%d predecessor_with_successor=%d diff_count=%d failed_count=%d\n",
+		__entry->merged,
+		__entry->predecessor_different_av, __entry->successor_different_av,
+		__entry->predecessor_with_successor_different_av,
+		__entry->diff_count, __entry->failed_count)
+
+);
+
+TRACE_EVENT(vm_pgoff_merge,
+
+	TP_PROTO(struct vm_area_struct *vma, bool anon_pgoff_updated),
+
+	TP_ARGS(vma, anon_pgoff_updated),
+
+	TP_STRUCT__entry(
+		__field(bool, faulted)
+		__field(bool, updated)
+	),
+
+	TP_fast_assign(
+		__entry->faulted = vma->anon_vma;
+		__entry->updated = anon_pgoff_updated;
+	),
+
+	TP_printk("faulted=%d updated=%d\n",
+		__entry->faulted, __entry->updated)
+);
 #endif
 
 /* This part must be outside protection */
diff --git a/mm/internal.h b/mm/internal.h
index d80300392a19..b3e482175518 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -34,6 +34,17 @@ struct folio_batch;
 /* Do not use these with a slab allocator */
 #define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
 
+/*
+ * Following values indicate reason for merge success or failure.
+ */
+enum vma_merge_res {
+	MERGE_FAILED,
+	AV_MERGE_FAILED,
+	AV_MERGE_SAME,
+	MERGE_OK = AV_MERGE_SAME,
+	AV_MERGE_DIFFERENT,
+};
+
 void page_writeback_init(void);
 
 static inline void *folio_raw_mapping(struct folio *folio)
diff --git a/mm/mmap.c b/mm/mmap.c
index ed91d0cd2111..10c76c6a3288 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1064,21 +1064,21 @@ static inline int is_mergeable_anon_vma(struct anon_vma *anon_vma1,
 	 */
 	if ((!anon_vma1 || !anon_vma2) &&
 		(!vma || list_is_singular(&vma->anon_vma_chain)))
-		return 1;
+		return AV_MERGE_SAME;
 	if (anon_vma1 == anon_vma2)
-		return 1;
+		return AV_MERGE_SAME;
 	/*
 	 * Different anon_vma but not shared by several processes
 	 */
 	else if ((anon_vma1 && anon_vma2) &&
 			(anon_vma1 == anon_vma1->root)
 			&& (rbt_no_children(anon_vma1)))
-		return 1;
+		return AV_MERGE_DIFFERENT;
 	/*
 	 * Different anon_vma and shared -> unmergeable
 	 */
 	else
-		return 0;
+		return AV_MERGE_FAILED;
 }
 
 /*
@@ -1099,12 +1099,10 @@ can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags,
 		     struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
 		     const char *anon_name)
 {
-	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name) &&
-	    is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) {
+	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name))
 		if (vma->vm_pgoff == vm_pgoff)
-			return 1;
-	}
-	return 0;
+			return is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma);
+	return MERGE_FAILED;
 }
 
 /*
@@ -1121,14 +1119,13 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
 		    struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
 		    const char *anon_name)
 {
-	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name) &&
-	    is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) {
+	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name)) {
 		pgoff_t vm_pglen;
 		vm_pglen = vma_pages(vma);
 		if (vma->vm_pgoff + vm_pglen == vm_pgoff)
-			return 1;
+			return is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma);
 	}
-	return 0;
+	return MERGE_FAILED;
 }
 
 /*
@@ -1185,9 +1182,14 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
 	struct vm_area_struct *area, *next;
 	int err;
-	int merge_prev = 0;
-	int merge_both = 0;
-	int merge_next = 0;
+	/*
+	 * Following three variables are used to store values
+	 * indicating whether this VMA and its anon_vma can
+	 * be merged and also the type of failure or success.
+	 */
+	enum vma_merge_res merge_prev = MERGE_FAILED;
+	enum vma_merge_res merge_both = MERGE_FAILED;
+	enum vma_merge_res merge_next = MERGE_FAILED;
 
 	/*
 	 * We later require that vma->vm_flags == vm_flags,
@@ -1210,38 +1212,39 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 	 * Can we merge predecessor?
 	 */
 	if (prev && prev->vm_end == addr &&
-			mpol_equal(vma_policy(prev), policy) &&
-			can_vma_merge_after(prev, vm_flags,
+			mpol_equal(vma_policy(prev), policy)) {
+		merge_prev = can_vma_merge_after(prev, vm_flags,
 					    anon_vma, file, pgoff,
-					    vm_userfaultfd_ctx, anon_name)) {
-		merge_prev = true;
+					    vm_userfaultfd_ctx, anon_name);
 	}
+
 	/*
 	 * Can we merge successor?
 	 */
 	if (next && end == next->vm_start &&
-			mpol_equal(policy, vma_policy(next)) &&
-			can_vma_merge_before(next, vm_flags,
+			mpol_equal(policy, vma_policy(next))) {
+		merge_next = can_vma_merge_before(next, vm_flags,
 					anon_vma, file, pgoff+pglen,
-					vm_userfaultfd_ctx, anon_name)) {
-		merge_next = true;
+					vm_userfaultfd_ctx, anon_name);
 	}
+
 	/*
 	 * Can we merge both predecessor and successor?
 	 */
-	if (merge_prev && merge_next)
+	if (merge_prev >= MERGE_OK && merge_next >= MERGE_OK) {
 		merge_both = is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL);
+	}
 
-	if (merge_both) {	 /* cases 1, 6 */
+	if (merge_both >= MERGE_OK) { /* cases 1, 6 */
 		err = __vma_adjust(prev, prev->vm_start,
 					next->vm_end, prev->vm_pgoff, NULL,
 					prev);
 		area = prev;
-	} else if (merge_prev) {			/* cases 2, 5, 7 */
+	} else if (merge_prev >= MERGE_OK) { /* cases 2, 5, 7 */
 		err = __vma_adjust(prev, prev->vm_start,
 					end, prev->vm_pgoff, NULL, prev);
 		area = prev;
-	} else if (merge_next) {
+	} else if (merge_next >= MERGE_OK) {
 		if (prev && addr < prev->vm_end)	/* case 4 */
 			err = __vma_adjust(prev, prev->vm_start,
 					addr, prev->vm_pgoff, NULL, next);
@@ -1252,7 +1255,7 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 	} else {
 		err = -1;
 	}
-
+	trace_vm_av_merge(err, merge_prev, merge_next, merge_both);
 	/*
 	 * Cannot merge with predecessor or successor or error in __vma_adjust?
 	 */
@@ -3359,6 +3362,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	/*
 	 * Source vma may have been merged into new_vma
 	 */
+	trace_vm_pgoff_merge(vma, anon_pgoff_updated);
+
 	if (unlikely(vma_start >= new_vma->vm_start &&
 		     vma_start < new_vma->vm_end)) {
 		/*
-- 
2.34.1
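
The branch selection that vma_merge() performs once patch 4/4 is applied
can be modelled in userspace. What follows is a minimal sketch under
stated assumptions, not kernel code: the enum is copied from the
mm/internal.h hunk of patch 4/4, while enum merge_action and pick_merge()
are hypothetical names introduced only for illustration, and the both_av
argument stands in for the result of
is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL).

```c
/* Copied from the mm/internal.h hunk of patch 4/4. */
enum vma_merge_res {
	MERGE_FAILED,
	AV_MERGE_FAILED,
	AV_MERGE_SAME,
	MERGE_OK = AV_MERGE_SAME,
	AV_MERGE_DIFFERENT,
};

/* Hypothetical labels for the branch vma_merge() would take. */
enum merge_action {
	ACT_NONE,	/* err = -1, vma_merge() returns NULL */
	ACT_PREV,	/* cases 2, 5, 7: prev is extended */
	ACT_NEXT,	/* cases 3, 4, 8: the merge_next branch */
	ACT_BOTH,	/* cases 1, 6: prev is extended over next */
};

/*
 * Hypothetical helper that follows the same branch order as the
 * patched vma_merge(). both_av is only consulted when both
 * neighbours are individually mergeable, as in the patch.
 */
static enum merge_action pick_merge(enum vma_merge_res merge_prev,
				    enum vma_merge_res merge_next,
				    enum vma_merge_res both_av)
{
	enum vma_merge_res merge_both = MERGE_FAILED;

	/* Every value at or above MERGE_OK counts as success. */
	if (merge_prev >= MERGE_OK && merge_next >= MERGE_OK)
		merge_both = both_av;

	if (merge_both >= MERGE_OK)
		return ACT_BOTH;
	if (merge_prev >= MERGE_OK)
		return ACT_PREV;
	if (merge_next >= MERGE_OK)
		return ACT_NEXT;
	return ACT_NONE;
}
```

The ordering of the enumerators is what makes the ">= MERGE_OK" tests
work: the two failure values sort below MERGE_OK and the two success
values sort at or above it, so a caller such as
pick_merge(AV_MERGE_SAME, AV_MERGE_DIFFERENT, AV_MERGE_FAILED) falls
back to the predecessor-only branch.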