From: Pedro Demarchi Gomes
To: damon@lists.linux.dev, linux-kernel@vger.kernel.org
Cc: Pedro Demarchi Gomes
Subject: [PATCH] damon: add feature to monitor only writes
Date: Wed, 29 Jan 2025 01:40:41 -0300
Message-Id: <20250129044041.25884-1-pedrodemargomes@gmail.com>
X-Mailer: git-send-email 2.30.2

Add a new DAMON operations set, DAMON_OPS_VADDR_WRITES, that monitors
only writes to virtual address spaces.

Unlike DAMON_OPS_VADDR, which checks page table Accessed bits, this
operations set reuses the soft-dirty mechanism: when an access check is
prepared, the sampled address is write-protected and its soft-dirty bit
is cleared, and at check time a region is reported as accessed when the
soft-dirty bit of its sampled PTE/PMD is set again.  The new operations
set can be selected through the DAMON sysfs interface as "vaddr-writes".

Signed-off-by: Pedro Demarchi Gomes
---
 include/linux/damon.h   |   5 +-
 mm/damon/Makefile       |   2 +-
 mm/damon/ops-common.c   |  80 +++++
 mm/damon/ops-common.h   |   3 +
 mm/damon/sysfs.c        |   1 +
 mm/damon/vaddr-writes.c | 735 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 824 insertions(+), 2 deletions(-)
 create mode 100644 mm/damon/vaddr-writes.c

diff --git a/include/linux/damon.h b/include/linux/damon.h
index af525252b853..9a6027faa7a6 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -485,6 +485,7 @@ struct damos {
  * enum damon_ops_id - Identifier for each monitoring operations implementation
  *
  * @DAMON_OPS_VADDR: Monitoring operations for virtual address spaces
+ * @DAMON_OPS_VADDR_WRITES: Monitoring only write operations for virtual address spaces
  * @DAMON_OPS_FVADDR: Monitoring operations for only fixed ranges of virtual
  *			address spaces
  * @DAMON_OPS_PADDR: Monitoring operations for the physical address space
@@ -492,6 +493,7 @@ struct damos {
  */
 enum damon_ops_id {
	DAMON_OPS_VADDR,
+	DAMON_OPS_VADDR_WRITES,
	DAMON_OPS_FVADDR,
	DAMON_OPS_PADDR,
	NR_DAMON_OPS,
@@ -846,7 +848,8 @@ int damon_select_ops(struct damon_ctx *ctx, enum damon_ops_id id);
 
 static inline bool damon_target_has_pid(const struct damon_ctx *ctx)
 {
-	return ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_FVADDR;
+	return ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_VADDR_WRITES
+		|| ctx->ops.id == DAMON_OPS_FVADDR;
 }
 
 static inline unsigned int damon_max_nr_accesses(const struct damon_attrs *attrs)
diff --git a/mm/damon/Makefile b/mm/damon/Makefile
index 8b49012ba8c3..c3c8dce0b34a 100644
--- a/mm/damon/Makefile
+++ b/mm/damon/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-y := core.o
-obj-$(CONFIG_DAMON_VADDR)	+= ops-common.o vaddr.o
+obj-$(CONFIG_DAMON_VADDR)	+= ops-common.o vaddr.o vaddr-writes.o
 obj-$(CONFIG_DAMON_PADDR)	+= ops-common.o paddr.o
 obj-$(CONFIG_DAMON_SYSFS)	+= sysfs-common.o sysfs-schemes.o sysfs.o
 obj-$(CONFIG_DAMON_RECLAIM)	+= modules-common.o reclaim.o
diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
index d25d99cb5f2b..4a3cad303a60 100644
--- a/mm/damon/ops-common.c
+++ b/mm/damon/ops-common.c
@@ -9,6 +9,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "ops-common.h"
 
@@ -67,6 +69,84 @@ void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr
 #endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
 }
 
+static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
+{
+	struct folio *folio;
+
+	if (!pte_write(pte))
+		return false;
+	if (!is_cow_mapping(vma->vm_flags))
+		return false;
+	if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
+		return false;
+	folio = vm_normal_folio(vma, addr, pte);
+	if (!folio)
+		return false;
+	return folio_maybe_dma_pinned(folio);
+}
+
+static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
+		unsigned long addr, pmd_t *pmdp)
+{
+	pmd_t old, pmd = *pmdp;
+
+	if (pmd_present(pmd)) {
+		/* See comment in change_huge_pmd() */
+		old = pmdp_invalidate(vma, addr, pmdp);
+		if (pmd_dirty(old))
+			pmd = pmd_mkdirty(pmd);
+		if (pmd_young(old))
+			pmd = pmd_mkyoung(pmd);
+
+		pmd = pmd_wrprotect(pmd);
+		pmd = pmd_clear_soft_dirty(pmd);
+
+		set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
+	} else if (is_migration_entry(pmd_to_swp_entry(pmd))) {
+		pmd = pmd_swp_clear_soft_dirty(pmd);
+		set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
+	}
+}
+
+static inline void clear_soft_dirty(struct vm_area_struct *vma,
+		unsigned long addr, pte_t *pte)
+{
+	/*
+	 * The soft-dirty tracker uses #PF-s to catch writes
+	 * to pages, so write-protect the pte as well. See the
+	 * Documentation/admin-guide/mm/soft-dirty.rst for full description
+	 * of how soft-dirty works.
+	 */
+	pte_t ptent = *pte;
+
+	if (pte_present(ptent)) {
+		pte_t old_pte;
+
+		if (pte_is_pinned(vma, addr, ptent))
+			return;
+		old_pte = ptep_modify_prot_start(vma, addr, pte);
+		ptent = pte_wrprotect(old_pte);
+		ptent = pte_clear_soft_dirty(ptent);
+		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
+	} else if (is_swap_pte(ptent)) {
+		ptent = pte_swp_clear_soft_dirty(ptent);
+		set_pte_at(vma->vm_mm, addr, pte, ptent);
+	}
+}
+
+void damon_pmdp_clean_soft_dirty(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr)
+{
+	if (pmd_soft_dirty(*pmd))
+		clear_soft_dirty_pmd(vma, addr, pmd);
+
+}
+
+void damon_ptep_clean_soft_dirty(pte_t *pte, struct vm_area_struct *vma, unsigned long addr)
+{
+	if (pte_soft_dirty(*pte))
+		clear_soft_dirty(vma, addr, pte);
+}
+
 #define DAMON_MAX_SUBSCORE	(100)
 #define DAMON_MAX_AGE_IN_LOG	(32)
 
diff --git a/mm/damon/ops-common.h b/mm/damon/ops-common.h
index 18d837d11bce..cbd35d228f5c 100644
--- a/mm/damon/ops-common.h
+++ b/mm/damon/ops-common.h
@@ -12,6 +12,9 @@ struct folio *damon_get_folio(unsigned long pfn);
 void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr);
 void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr);
 
+void damon_ptep_clean_soft_dirty(pte_t *pte, struct vm_area_struct *vma, unsigned long addr);
+void damon_pmdp_clean_soft_dirty(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr);
+
 int damon_cold_score(struct damon_ctx *c, struct damon_region *r,
			struct damos *s);
 int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c
index deeab04d3b46..f01b31f1dfb9 100644
--- a/mm/damon/sysfs.c
+++ b/mm/damon/sysfs.c
@@ -625,6 +625,7 @@ static const struct kobj_type damon_sysfs_attrs_ktype = {
 /* This should match with enum damon_ops_id */
 static const char * const damon_sysfs_ops_strs[] = {
	"vaddr",
+	"vaddr-writes",
	"fvaddr",
	"paddr",
 };
diff --git a/mm/damon/vaddr-writes.c b/mm/damon/vaddr-writes.c
new file mode 100644
index 000000000000..b887a4bc6d11
--- /dev/null
+++ b/mm/damon/vaddr-writes.c
@@ -0,0 +1,735 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DAMON Primitives for Virtual Address Spaces
+ *
+ * Author: SeongJae Park
+ */
+
+#define pr_fmt(fmt) "damon-va-writes: " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "ops-common.h"
+
+#ifdef CONFIG_DAMON_VADDR_KUNIT_TEST
+#undef DAMON_MIN_REGION
+#define DAMON_MIN_REGION 1
+#endif
+
+/*
+ * 't->pid' should be the pointer to the relevant 'struct pid' having reference
+ * count.  Caller must put the returned task, unless it is NULL.
+ */
+static inline struct task_struct *damon_get_task_struct(struct damon_target *t)
+{
+	return get_pid_task(t->pid, PIDTYPE_PID);
+}
+
+/*
+ * Get the mm_struct of the given target
+ *
+ * Caller _must_ put the mm_struct after use, unless it is NULL.
+ *
+ * Returns the mm_struct of the target on success, NULL on failure
+ */
+static struct mm_struct *damon_get_mm(struct damon_target *t)
+{
+	struct task_struct *task;
+	struct mm_struct *mm;
+
+	task = damon_get_task_struct(t);
+	if (!task)
+		return NULL;
+
+	mm = get_task_mm(task);
+	put_task_struct(task);
+	return mm;
+}
+
+/*
+ * Functions for the initial monitoring target regions construction
+ */
+
+/*
+ * Size-evenly split a region into 'nr_pieces' small regions
+ *
+ * Returns 0 on success, or negative error code otherwise.
+ */
+static int damon_va_evenly_split_region(struct damon_target *t,
+		struct damon_region *r, unsigned int nr_pieces)
+{
+	unsigned long sz_orig, sz_piece, orig_end;
+	struct damon_region *n = NULL, *next;
+	unsigned long start;
+	unsigned int i;
+
+	if (!r || !nr_pieces)
+		return -EINVAL;
+
+	if (nr_pieces == 1)
+		return 0;
+
+	orig_end = r->ar.end;
+	sz_orig = damon_sz_region(r);
+	sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, DAMON_MIN_REGION);
+
+	if (!sz_piece)
+		return -EINVAL;
+
+	r->ar.end = r->ar.start + sz_piece;
+	next = damon_next_region(r);
+	for (start = r->ar.end, i = 1; i < nr_pieces; start += sz_piece, i++) {
+		n = damon_new_region(start, start + sz_piece);
+		if (!n)
+			return -ENOMEM;
+		damon_insert_region(n, r, next, t);
+		r = n;
+	}
+	/* complement last region for possible rounding error */
+	if (n)
+		n->ar.end = orig_end;
+
+	return 0;
+}
+
+static unsigned long sz_range(struct damon_addr_range *r)
+{
+	return r->end - r->start;
+}
+
+/*
+ * Find three regions separated by two biggest unmapped regions
+ *
+ * vma		the head vma of the target address space
+ * regions	an array of three address ranges that results will be saved
+ *
+ * This function receives an address space and finds three regions in it which
+ * separated by the two biggest unmapped regions in the space.  Please refer to
+ * below comments of '__damon_va_init_regions()' function to know why this is
+ * necessary.
+ *
+ * Returns 0 if success, or negative error code otherwise.
+ */
+static int __damon_va_three_regions(struct mm_struct *mm,
+				       struct damon_addr_range regions[3])
+{
+	struct damon_addr_range first_gap = {0}, second_gap = {0};
+	VMA_ITERATOR(vmi, mm, 0);
+	struct vm_area_struct *vma, *prev = NULL;
+	unsigned long start;
+
+	/*
+	 * Find the two biggest gaps so that first_gap > second_gap > others.
+	 * If this is too slow, it can be optimised to examine the maple
+	 * tree gaps.
+	 */
+	rcu_read_lock();
+	for_each_vma(vmi, vma) {
+		unsigned long gap;
+
+		if (!prev) {
+			start = vma->vm_start;
+			goto next;
+		}
+		gap = vma->vm_start - prev->vm_end;
+
+		if (gap > sz_range(&first_gap)) {
+			second_gap = first_gap;
+			first_gap.start = prev->vm_end;
+			first_gap.end = vma->vm_start;
+		} else if (gap > sz_range(&second_gap)) {
+			second_gap.start = prev->vm_end;
+			second_gap.end = vma->vm_start;
+		}
+next:
+		prev = vma;
+	}
+	rcu_read_unlock();
+
+	if (!sz_range(&second_gap) || !sz_range(&first_gap))
+		return -EINVAL;
+
+	/* Sort the two biggest gaps by address */
+	if (first_gap.start > second_gap.start)
+		swap(first_gap, second_gap);
+
+	/* Store the result */
+	regions[0].start = ALIGN(start, DAMON_MIN_REGION);
+	regions[0].end = ALIGN(first_gap.start, DAMON_MIN_REGION);
+	regions[1].start = ALIGN(first_gap.end, DAMON_MIN_REGION);
+	regions[1].end = ALIGN(second_gap.start, DAMON_MIN_REGION);
+	regions[2].start = ALIGN(second_gap.end, DAMON_MIN_REGION);
+	regions[2].end = ALIGN(prev->vm_end, DAMON_MIN_REGION);
+
+	return 0;
+}
+
+/*
+ * Get the three regions in the given target (task)
+ *
+ * Returns 0 on success, negative error code otherwise.
+ */
+static int damon_va_three_regions(struct damon_target *t,
+				struct damon_addr_range regions[3])
+{
+	struct mm_struct *mm;
+	int rc;
+
+	mm = damon_get_mm(t);
+	if (!mm)
+		return -EINVAL;
+
+	mmap_read_lock(mm);
+	rc = __damon_va_three_regions(mm, regions);
+	mmap_read_unlock(mm);
+
+	mmput(mm);
+	return rc;
+}
+
+/*
+ * Initialize the monitoring target regions for the given target (task)
+ *
+ * t	the given target
+ *
+ * Because only a number of small portions of the entire address space
+ * is actually mapped to the memory and accessed, monitoring the unmapped
+ * regions is wasteful.  That said, because we can deal with small noises,
+ * tracking every mapping is not strictly required but could even incur a high
+ * overhead if the mapping frequently changes or the number of mappings is
+ * high.  The adaptive regions adjustment mechanism will further help to deal
+ * with the noise by simply identifying the unmapped areas as a region that
+ * has no access.  Moreover, applying the real mappings that would have many
+ * unmapped areas inside will make the adaptive mechanism quite complex.  That
+ * said, too huge unmapped areas inside the monitoring target should be removed
+ * to not take the time for the adaptive mechanism.
+ *
+ * For the reason, we convert the complex mappings to three distinct regions
+ * that cover every mapped area of the address space.  Also the two gaps
+ * between the three regions are the two biggest unmapped areas in the given
+ * address space.  In detail, this function first identifies the start and the
+ * end of the mappings and the two biggest unmapped areas of the address space.
+ * Then, it constructs the three regions as below:
+ *
+ *     [mappings[0]->start, big_two_unmapped_areas[0]->start)
+ *     [big_two_unmapped_areas[0]->end, big_two_unmapped_areas[1]->start)
+ *     [big_two_unmapped_areas[1]->end, mappings[nr_mappings - 1]->end)
+ *
+ * As usual memory map of processes is as below, the gap between the heap and
+ * the uppermost mmap()-ed region, and the gap between the lowermost mmap()-ed
+ * region and the stack will be two biggest unmapped regions.  Because these
+ * gaps are exceptionally huge areas in usual address space, excluding these
+ * two biggest unmapped regions will be sufficient to make a trade-off.
+ *
+ *
+ *
+ *   (other mmap()-ed regions and small unmapped regions)
+ *
+ *
+ *
+ */
+static void __damon_va_init_regions(struct damon_ctx *ctx,
+				     struct damon_target *t)
+{
+	struct damon_target *ti;
+	struct damon_region *r;
+	struct damon_addr_range regions[3];
+	unsigned long sz = 0, nr_pieces;
+	int i, tidx = 0;
+
+	if (damon_va_three_regions(t, regions)) {
+		damon_for_each_target(ti, ctx) {
+			if (ti == t)
+				break;
+			tidx++;
+		}
+		pr_debug("Failed to get three regions of %dth target\n", tidx);
+		return;
+	}
+
+	for (i = 0; i < 3; i++)
+		sz += regions[i].end - regions[i].start;
+	if (ctx->attrs.min_nr_regions)
+		sz /= ctx->attrs.min_nr_regions;
+	if (sz < DAMON_MIN_REGION)
+		sz = DAMON_MIN_REGION;
+
+	/* Set the initial three regions of the target */
+	for (i = 0; i < 3; i++) {
+		r = damon_new_region(regions[i].start, regions[i].end);
+		if (!r) {
+			pr_err("%d'th init region creation failed\n", i);
+			return;
+		}
+		damon_add_region(r, t);
+
+		nr_pieces = (regions[i].end - regions[i].start) / sz;
+		damon_va_evenly_split_region(t, r, nr_pieces);
+	}
+}
+
+/* Initialize '->regions_list' of every target (task) */
+static void damon_va_init(struct damon_ctx *ctx)
+{
+	struct damon_target *t;
+
+	damon_for_each_target(t, ctx) {
+		/* the user may set the target regions as they want */
+		if (!damon_nr_regions(t))
+			__damon_va_init_regions(ctx, t);
+	}
+}
+
+/*
+ * Update regions for current memory mappings
+ */
+static void damon_va_update(struct damon_ctx *ctx)
+{
+	struct damon_addr_range three_regions[3];
+	struct damon_target *t;
+
+	damon_for_each_target(t, ctx) {
+		if (damon_va_three_regions(t, three_regions))
+			continue;
+		damon_set_regions(t, three_regions, 3);
+	}
+}
+
+static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
+{
+	pte_t *pte;
+	pmd_t pmde;
+	spinlock_t *ptl;
+
+	if (pmd_trans_huge(pmdp_get(pmd))) {
+		ptl = pmd_lock(walk->mm, pmd);
+		pmde = pmdp_get(pmd);
+
+		if (!pmd_present(pmde)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
+		if (pmd_trans_huge(pmde)) {
+			// damon_pmdp_mkold(pmd, walk->vma, addr);
+			damon_pmdp_clean_soft_dirty(pmd, walk->vma, addr);
+			spin_unlock(ptl);
+			return 0;
+		}
+		spin_unlock(ptl);
+	}
+
+	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	if (!pte) {
+		walk->action = ACTION_AGAIN;
+		return 0;
+	}
+	if (!pte_present(ptep_get(pte)))
+		goto out;
+	// damon_ptep_mkold(pte, walk->vma, addr);
+	damon_ptep_clean_soft_dirty(pte, walk->vma, addr);
+out:
+	pte_unmap_unlock(pte, ptl);
+	return 0;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
+				struct vm_area_struct *vma, unsigned long addr)
+{
+	bool referenced = false;
+	pte_t entry = huge_ptep_get(mm, addr, pte);
+	struct folio *folio = pfn_folio(pte_pfn(entry));
+	unsigned long psize = huge_page_size(hstate_vma(vma));
+
+	folio_get(folio);
+
+	if (pte_young(entry)) {
+		referenced = true;
+		entry = pte_mkold(entry);
+		set_huge_pte_at(mm, addr, pte, entry, psize);
+	}
+
+	if (mmu_notifier_clear_young(mm, addr,
+				     addr + huge_page_size(hstate_vma(vma))))
+		referenced = true;
+
+	if (referenced)
+		folio_set_young(folio);
+
+	folio_set_idle(folio);
+	folio_put(folio);
+}
+
+static int damon_mkold_hugetlb_entry(pte_t *pte, unsigned long hmask,
+				     unsigned long addr, unsigned long end,
+				     struct mm_walk *walk)
+{
+	struct hstate *h = hstate_vma(walk->vma);
+	spinlock_t *ptl;
+	pte_t entry;
+
+	ptl = huge_pte_lock(h, walk->mm, pte);
+	entry = huge_ptep_get(walk->mm, addr, pte);
+	if (!pte_present(entry))
+		goto out;
+
+	damon_hugetlb_mkold(pte, walk->mm, walk->vma, addr);
+
+out:
+	spin_unlock(ptl);
+	return 0;
+}
+#else
+#define damon_mkold_hugetlb_entry NULL
+#endif /* CONFIG_HUGETLB_PAGE */
+
+static const struct mm_walk_ops damon_mkold_ops = {
+	.pmd_entry = damon_mkold_pmd_entry,
+	// .hugetlb_entry = damon_mkold_hugetlb_entry,
+	.walk_lock = PGWALK_RDLOCK,
+};
+
+static void damon_va_mkold(struct mm_struct *mm, unsigned long addr)
+{
+	mmap_read_lock(mm);
+	walk_page_range(mm, addr, addr + 1, &damon_mkold_ops, NULL);
+	mmap_read_unlock(mm);
+}
+
+/*
+ * Functions for the access checking of the regions
+ */
+
+static void __damon_va_prepare_access_check(struct mm_struct *mm,
+					struct damon_region *r)
+{
+	r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
+
+	damon_va_mkold(mm, r->sampling_addr);
+}
+
+static void damon_va_prepare_access_checks(struct damon_ctx *ctx)
+{
+	struct damon_target *t;
+	struct mm_struct *mm;
+	struct damon_region *r;
+
+	damon_for_each_target(t, ctx) {
+		mm = damon_get_mm(t);
+		if (!mm)
+			continue;
+		damon_for_each_region(r, t)
+			__damon_va_prepare_access_check(mm, r);
+		mmput(mm);
+	}
+}
+
+struct damon_young_walk_private {
+	/* size of the folio for the access checked virtual memory address */
+	unsigned long *folio_sz;
+	bool young;
+};
+
+static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
+{
+	pte_t *pte;
+	pte_t ptent;
+	spinlock_t *ptl;
+	struct folio *folio;
+	struct damon_young_walk_private *priv = walk->private;
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	if (pmd_trans_huge(pmdp_get(pmd))) {
+		pmd_t pmde;
+
+		ptl = pmd_lock(walk->mm, pmd);
+		pmde = pmdp_get(pmd);
+
+		if (!pmd_present(pmde)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
+		if (!pmd_trans_huge(pmde)) {
+			spin_unlock(ptl);
+			goto regular_page;
+		}
+		folio = damon_get_folio(pmd_pfn(pmde));
+		if (!folio)
+			goto huge_out;
+		if (pmd_soft_dirty(pmde))
+			priv->young = true;
+		*priv->folio_sz = HPAGE_PMD_SIZE;
+		folio_put(folio);
+huge_out:
+		spin_unlock(ptl);
+		return 0;
+	}
+
+regular_page:
+#endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
+
+	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	if (!pte) {
+		walk->action = ACTION_AGAIN;
+		return 0;
+	}
+	ptent = ptep_get(pte);
+	if (!pte_present(ptent))
+		goto out;
+	folio = damon_get_folio(pte_pfn(ptent));
+	if (!folio)
+		goto out;
+	if (pte_soft_dirty(ptent))
+		priv->young = true;
+	*priv->folio_sz = folio_size(folio);
+	folio_put(folio);
+out:
+	pte_unmap_unlock(pte, ptl);
+	return 0;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask,
+				     unsigned long addr, unsigned long end,
+				     struct mm_walk *walk)
+{
+	struct damon_young_walk_private *priv = walk->private;
+	struct hstate *h = hstate_vma(walk->vma);
+	struct folio *folio;
+	spinlock_t *ptl;
+	pte_t entry;
+
+	ptl = huge_pte_lock(h, walk->mm, pte);
+	entry = huge_ptep_get(walk->mm, addr, pte);
+	if (!pte_present(entry))
+		goto out;
+
+	folio = pfn_folio(pte_pfn(entry));
+	folio_get(folio);
+
+	if (pte_young(entry) || !folio_test_idle(folio) ||
+	    mmu_notifier_test_young(walk->mm, addr))
+		priv->young = true;
+	*priv->folio_sz = huge_page_size(h);
+
+	folio_put(folio);
+
+out:
+	spin_unlock(ptl);
+	return 0;
+}
+#else
+#define damon_young_hugetlb_entry NULL
+#endif /* CONFIG_HUGETLB_PAGE */
+
+static const struct mm_walk_ops damon_young_ops = {
+	.pmd_entry = damon_young_pmd_entry,
+	.hugetlb_entry = damon_young_hugetlb_entry,
+	.walk_lock = PGWALK_RDLOCK,
+};
+
+static bool damon_va_young(struct mm_struct *mm, unsigned long addr,
+		unsigned long *folio_sz)
+{
+	struct damon_young_walk_private arg = {
+		.folio_sz = folio_sz,
+		.young = false,
+	};
+
+	mmap_read_lock(mm);
+	walk_page_range(mm, addr, addr + 1, &damon_young_ops, &arg);
+	mmap_read_unlock(mm);
+	return arg.young;
+}
+
+/*
+ * Check whether the region was accessed after the last preparation
+ *
+ * mm	'mm_struct' for the given virtual address space
+ * r	the region to be checked
+ */
+static void __damon_va_check_access(struct mm_struct *mm,
+				struct damon_region *r, bool same_target,
+				struct damon_attrs *attrs)
+{
+	static unsigned long last_addr;
+	static unsigned long last_folio_sz = PAGE_SIZE;
+	static bool last_accessed;
+
+	if (!mm) {
+		damon_update_region_access_rate(r, false, attrs);
+		return;
+	}
+
+	/* If the region is in the last checked page, reuse the result */
+	if (same_target && (ALIGN_DOWN(last_addr, last_folio_sz) ==
+				ALIGN_DOWN(r->sampling_addr, last_folio_sz))) {
+		damon_update_region_access_rate(r, last_accessed, attrs);
+		return;
+	}
+
+	last_accessed = damon_va_young(mm, r->sampling_addr, &last_folio_sz);
+	damon_update_region_access_rate(r, last_accessed, attrs);
+
+	last_addr = r->sampling_addr;
+}
+
+static unsigned int damon_va_check_accesses(struct damon_ctx *ctx)
+{
+	struct damon_target *t;
+	struct mm_struct *mm;
+	struct damon_region *r;
+	unsigned int max_nr_accesses = 0;
+	bool same_target;
+
+	damon_for_each_target(t, ctx) {
+		mm = damon_get_mm(t);
+		same_target = false;
+		damon_for_each_region(r, t) {
+			__damon_va_check_access(mm, r, same_target,
+					&ctx->attrs);
+			max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+			same_target = true;
+		}
+		if (mm)
+			mmput(mm);
+	}
+
+	return max_nr_accesses;
+}
+
+/*
+ * Functions for the target validity check and cleanup
+ */
+
+static bool damon_va_target_valid(struct damon_target *t)
+{
+	struct task_struct *task;
+
+	task = damon_get_task_struct(t);
+	if (task) {
+		put_task_struct(task);
+		return true;
+	}
+
+	return false;
+}
+
+#ifndef CONFIG_ADVISE_SYSCALLS
+static unsigned long damos_madvise(struct damon_target *target,
+		struct damon_region *r, int behavior)
+{
+	return 0;
+}
+#else
+static unsigned long damos_madvise(struct damon_target *target,
+		struct damon_region *r, int behavior)
+{
+	struct mm_struct *mm;
+	unsigned long start = PAGE_ALIGN(r->ar.start);
+	unsigned long len = PAGE_ALIGN(damon_sz_region(r));
+	unsigned long applied;
+
+	mm = damon_get_mm(target);
+	if (!mm)
+		return 0;
+
+	applied = do_madvise(mm, start, len, behavior) ? 0 : len;
+	mmput(mm);
+
+	return applied;
+}
+#endif	/* CONFIG_ADVISE_SYSCALLS */
+
+static unsigned long damon_va_apply_scheme(struct damon_ctx *ctx,
+		struct damon_target *t, struct damon_region *r,
+		struct damos *scheme)
+{
+	int madv_action;
+
+	switch (scheme->action) {
+	case DAMOS_WILLNEED:
+		madv_action = MADV_WILLNEED;
+		break;
+	case DAMOS_COLD:
+		madv_action = MADV_COLD;
+		break;
+	case DAMOS_PAGEOUT:
+		madv_action = MADV_PAGEOUT;
+		break;
+	case DAMOS_HUGEPAGE:
+		madv_action = MADV_HUGEPAGE;
+		break;
+	case DAMOS_NOHUGEPAGE:
+		madv_action = MADV_NOHUGEPAGE;
+		break;
+	case DAMOS_STAT:
+		return 0;
+	default:
+		/*
+		 * DAMOS actions that are not yet supported by 'vaddr'.
+		 */
+		return 0;
+	}
+
+	return damos_madvise(t, r, madv_action);
+}
+
+static int damon_va_scheme_score(struct damon_ctx *context,
+		struct damon_target *t, struct damon_region *r,
+		struct damos *scheme)
+{
+
+	switch (scheme->action) {
+	case DAMOS_PAGEOUT:
+		return damon_cold_score(context, r, scheme);
+	default:
+		break;
+	}
+
+	return DAMOS_MAX_SCORE;
+}
+
+static int __init damon_va_initcall(void)
+{
+	struct damon_operations ops = {
+		.id = DAMON_OPS_VADDR_WRITES,
+		.init = damon_va_init,
+		.update = damon_va_update,
+		.prepare_access_checks = damon_va_prepare_access_checks,
+		.check_accesses = damon_va_check_accesses,
+		.reset_aggregated = NULL,
+		.target_valid = damon_va_target_valid,
+		.cleanup = NULL,
+		.apply_scheme = damon_va_apply_scheme,
+		.get_scheme_score = damon_va_scheme_score,
+	};
+	/* ops for fixed virtual address ranges */
+	struct damon_operations ops_fvaddr = ops;
+	int err;
+
+	/* Don't set the monitoring target regions for the entire mapping */
+	ops_fvaddr.id = DAMON_OPS_FVADDR;
+	ops_fvaddr.init = NULL;
+	ops_fvaddr.update = NULL;
+
+	err = damon_register_ops(&ops);
+	if (err)
+		return err;
+	return damon_register_ops(&ops_fvaddr);
+};
+
+subsys_initcall(damon_va_initcall);
+
+#include "tests/vaddr-kunit.h"
-- 
2.30.2
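
For illustration only, and not part of the patch: a minimal sketch of how an in-kernel DAMON user might select the new operations set through the existing DAMON kernel API, assuming damon_new_ctx(), damon_select_ops(), damon_new_target(), damon_add_target(), damon_start() and damon_destroy_ctx() behave as in current mainline. The wrapper function, the global context variable, and the PID handling below are hypothetical; from user space, the equivalent choice would be writing the new "vaddr-writes" string to a context's operations sysfs file.

/*
 * Sketch (not part of the patch): start write-only monitoring of the
 * virtual address space of the task identified by 'pid_nr'.
 */
#include <linux/damon.h>
#include <linux/pid.h>

static struct damon_ctx *write_monitor_ctx;

static int start_write_monitoring(int pid_nr)
{
	struct damon_target *t;
	int err;

	write_monitor_ctx = damon_new_ctx();
	if (!write_monitor_ctx)
		return -ENOMEM;

	/* Pick the write-only variant instead of DAMON_OPS_VADDR. */
	err = damon_select_ops(write_monitor_ctx, DAMON_OPS_VADDR_WRITES);
	if (err)
		goto out_ctx;

	t = damon_new_target();
	if (!t) {
		err = -ENOMEM;
		goto out_ctx;
	}
	t->pid = find_get_pid(pid_nr);	/* task whose writes are monitored */
	damon_add_target(write_monitor_ctx, t);

	/* Run the context with its own kdamond, non-exclusively. */
	err = damon_start(&write_monitor_ctx, 1, false);
	if (err)
		goto out_ctx;
	return 0;

out_ctx:
	damon_destroy_ctx(write_monitor_ctx);
	write_monitor_ctx = NULL;
	return err;
}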