From nobody Wed Oct 1 22:33:22 2025
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Axel Rasmussen, Vlastimil Babka, James Houghton, Nikita Kalyazin,
	David Hildenbrand, Lorenzo Stoakes, Ujwal Kundur, Mike Rapoport,
	Andrew Morton, peterx@redhat.com, Andrea Arcangeli,
	"Liam R . Howlett", Michal Hocko, Muchun Song, Oscar Salvador,
	Hugh Dickins, Suren Baghdasaryan
Subject: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Date: Fri, 26 Sep 2025 17:16:47 -0400
Message-ID: <20250926211650.525109-2-peterx@redhat.com>
In-Reply-To: <20250926211650.525109-1-peterx@redhat.com>
References: <20250926211650.525109-1-peterx@redhat.com>

Currently, most userfaultfd features are implemented directly in core mm,
which invokes VMA-specific functions whenever necessary. So far this has
been fine because core mm interacts almost exclusively with shmem and
hugetlbfs.

Introduce a generic userfaultfd API extension for vm_operations_struct,
so that any code that implements vm_operations_struct (including kernel
modules that can be compiled separately from the kernel core) can support
userfaults without modifying the core files. With this API applied, a
module that wants to support userfaultfd should only need to define
vm_uffd_ops properly and hook it to vm_operations_struct, instead of
changing anything in core mm.

This API will not work for anonymous memory. Handling of userfault
operations for anonymous memory remains unchanged in core mm.

Due to a security concern raised while reviewing older versions of this
series [1], uffd_copy() is temporarily removed. In other words, for now
MISSING-capable memory types can only be hard-coded and implemented in
mm/. This also affects UFFDIO_COPY and UFFDIO_ZEROPAGE. Other functions
can still be provided through vm_uffd_ops.

This patch only introduces the API, so that existing userfaultfd users
can be moved over without breaking them.

[1] https://lore.kernel.org/all/20250627154655.2085903-1-peterx@redhat.com/

Signed-off-by: Peter Xu
---
 include/linux/mm.h            |  9 +++++++++
 include/linux/userfaultfd_k.h | 37 +++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6b6c6980f46c2..8afb93387e2c6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -620,6 +620,8 @@ struct vm_fault {
 					 */
 };
 
+struct vm_uffd_ops;
+
 /*
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -705,6 +707,13 @@ struct vm_operations_struct {
 	struct page *(*find_normal_page)(struct vm_area_struct *vma,
 					 unsigned long addr);
 #endif /* CONFIG_FIND_NORMAL_PAGE */
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Userfaultfd related ops. Modules need to define this to support
+	 * userfaultfd.
+	 */
+	const struct vm_uffd_ops *userfaultfd_ops;
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c0e716aec26aa..b1949d8611238 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -92,6 +92,43 @@ enum mfill_atomic_mode {
 	NR_MFILL_ATOMIC_MODES,
 };
 
+/* VMA userfaultfd operations */
+struct vm_uffd_ops {
+	/**
+	 * @uffd_features: features supported in bitmask.
+	 *
+	 * When the ops is defined, the driver must set non-zero features
+	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
+	 *
+	 * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
+	 */
+	unsigned long uffd_features;
+	/**
+	 * @uffd_ioctls: ioctls supported in bitmask.
+	 *
+	 * Userfaultfd ioctls supported by the module. Below will always
+	 * be supported by default whenever a module provides vm_uffd_ops:
+	 *
+	 * _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
+	 *
+	 * The module needs to provide all the rest optionally supported
+	 * ioctls. For example, when VM_UFFD_MINOR is supported,
+	 * _UFFDIO_CONTINUE must be supported as an ioctl.
+	 */
+	unsigned long uffd_ioctls;
+	/**
+	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
+	 *
+	 * @inode: the inode for folio lookup
+	 * @pgoff: the pgoff of the folio
+	 * @folio: returned folio pointer
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
+			      struct folio **folio);
+};
+
 #define MFILL_ATOMIC_MODE_BITS (const_ilog2(NR_MFILL_ATOMIC_MODES - 1) + 1)
 #define MFILL_ATOMIC_BIT(nr) BIT(MFILL_ATOMIC_MODE_BITS + (nr))
 #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr))
-- 
2.50.1

From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Axel Rasmussen, Vlastimil Babka, James Houghton, Nikita Kalyazin,
	David Hildenbrand, Lorenzo Stoakes, Ujwal Kundur, Mike Rapoport,
	Andrew Morton, peterx@redhat.com, Andrea Arcangeli,
	"Liam R . Howlett", Michal Hocko, Muchun Song, Oscar Salvador,
	Hugh Dickins, Suren Baghdasaryan
Subject: [PATCH v3 2/4] mm/shmem: Support vm_uffd_ops API
Date: Fri, 26 Sep 2025 17:16:48 -0400
Message-ID: <20250926211650.525109-3-peterx@redhat.com>
In-Reply-To: <20250926211650.525109-1-peterx@redhat.com>
References: <20250926211650.525109-1-peterx@redhat.com>

Add support for the new vm_uffd_ops API to shmem. Note that this only
introduces the support; the API is not yet used by core mm.

It only needs a separate uffd_get_folio() definition, which is a
one-liner.

Due to the current vm_uffd_ops limitation on MISSING mode support, the
shmem UFFDIO_COPY/ZEROPAGE paths are still hard-coded in mm/.

Cc: Hugh Dickins
Acked-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/shmem.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index 4855eee227310..e7b44efbfddf2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3148,6 +3148,13 @@ static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap,
 #endif /* CONFIG_TMPFS_QUOTA */
 
 #ifdef CONFIG_USERFAULTFD
+
+static int shmem_uffd_get_folio(struct inode *inode, pgoff_t pgoff,
+				struct folio **folio)
+{
+	return shmem_get_folio(inode, pgoff, 0, folio, SGP_NOALLOC);
+}
+
 int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 			   struct vm_area_struct *dst_vma,
 			   unsigned long dst_addr,
@@ -5191,6 +5198,18 @@ static int shmem_error_remove_folio(struct address_space *mapping,
 	return 0;
 }
 
+#ifdef CONFIG_USERFAULTFD
+static const struct vm_uffd_ops shmem_uffd_ops = {
+	.uffd_features = __VM_UFFD_FLAGS,
+	.uffd_ioctls = BIT(_UFFDIO_COPY) |
+		       BIT(_UFFDIO_ZEROPAGE) |
+		       BIT(_UFFDIO_WRITEPROTECT) |
+		       BIT(_UFFDIO_CONTINUE) |
+		       BIT(_UFFDIO_POISON),
+	.uffd_get_folio = shmem_uffd_get_folio,
+};
+#endif
+
 static const struct address_space_operations shmem_aops = {
 	.dirty_folio = noop_dirty_folio,
 #ifdef CONFIG_TMPFS
@@ -5293,6 +5312,9 @@ static const struct vm_operations_struct shmem_vm_ops = {
 	.set_policy = shmem_set_policy,
 	.get_policy = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_ops = &shmem_uffd_ops,
+#endif
 };
 
 static const struct vm_operations_struct shmem_anon_vm_ops = {
@@ -5302,6 +5324,9 @@ static const struct vm_operations_struct shmem_anon_vm_ops = {
 	.set_policy = shmem_set_policy,
 	.get_policy = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_ops = &shmem_uffd_ops,
+#endif
 };
 
 int shmem_init_fs_context(struct fs_context *fc)
-- 
2.50.1

From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Axel Rasmussen, Vlastimil Babka, James Houghton, Nikita Kalyazin,
	David Hildenbrand, Lorenzo Stoakes, Ujwal Kundur, Mike Rapoport,
	Andrew Morton, peterx@redhat.com, Andrea Arcangeli,
	"Liam R . Howlett", Michal Hocko, Muchun Song, Oscar Salvador,
	Hugh Dickins, Suren Baghdasaryan
Subject: [PATCH v3 3/4] mm/hugetlb: Support vm_uffd_ops API
Date: Fri, 26 Sep 2025 17:16:49 -0400
Message-ID: <20250926211650.525109-4-peterx@redhat.com>
In-Reply-To: <20250926211650.525109-1-peterx@redhat.com>
References: <20250926211650.525109-1-peterx@redhat.com>

Add support for the new vm_uffd_ops API to hugetlb. Note that this only
introduces the support; the API is not yet used by core mm.

For legacy reasons, it is still not trivial to move hugetlb completely to
the API (as is done for shmem). It still uses uffd_features and
uffd_ioctls from the API, though, because those are fairly general.

Cc: Muchun Song
Cc: Oscar Salvador
Acked-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cb5c4e79e0b8f..b6eb92828ee15 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5492,6 +5492,22 @@ static vm_fault_t hugetlb_vm_op_fault(struct vm_fault *vmf)
 	return 0;
 }
 
+#ifdef CONFIG_USERFAULTFD
+static const struct vm_uffd_ops hugetlb_uffd_ops = {
+	.uffd_features = __VM_UFFD_FLAGS,
+	/* _UFFDIO_ZEROPAGE not supported */
+	.uffd_ioctls = BIT(_UFFDIO_COPY) |
+		       BIT(_UFFDIO_WRITEPROTECT) |
+		       BIT(_UFFDIO_CONTINUE) |
+		       BIT(_UFFDIO_POISON),
+	/*
+	 * Hugetlbfs still has its own hard-coded handler in userfaultfd,
+	 * due to limitations similar to vm_operations_struct.fault().
+	 * TODO: generalize it to use the API functions.
+	 */
+};
+#endif
+
 /*
  * When a new function is introduced to vm_operations_struct and added
  * to hugetlb_vm_ops, please consider adding the function to shm_vm_ops.
@@ -5505,6 +5521,9 @@ const struct vm_operations_struct hugetlb_vm_ops = {
 	.close = hugetlb_vm_op_close,
 	.may_split = hugetlb_vm_op_split,
 	.pagesize = hugetlb_vm_op_pagesize,
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_ops = &hugetlb_uffd_ops,
+#endif
 };
 
 static pte_t make_huge_pte(struct vm_area_struct *vma, struct folio *folio,
-- 
2.50.1

From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Axel Rasmussen, Vlastimil Babka, James Houghton, Nikita Kalyazin,
	David Hildenbrand, Lorenzo Stoakes, Ujwal Kundur, Mike Rapoport,
	Andrew Morton, peterx@redhat.com, Andrea Arcangeli,
	"Liam R . Howlett", Michal Hocko, Muchun Song, Oscar Salvador,
	Hugh Dickins, Suren Baghdasaryan
Subject: [PATCH v3 4/4] mm: Apply vm_uffd_ops API to core mm
Date: Fri, 26 Sep 2025 17:16:50 -0400
Message-ID: <20250926211650.525109-5-peterx@redhat.com>
In-Reply-To: <20250926211650.525109-1-peterx@redhat.com>
References: <20250926211650.525109-1-peterx@redhat.com>

Move the userfaultfd core to use the new vm_uffd_ops API. After this
change, file systems that implement vm_operations_struct can start using
the new API for userfaultfd operations.

While at it, move vma_can_userfault() into mm/userfaultfd.c, because it
is getting too big. It is only used in slow paths, so this shouldn't be
an issue. Also move the pte marker check before the wp_async check, which
might be more intuitive because wp_async depends on pte markers. That
shouldn't cause any functional change, though, because only one check
takes effect, depending on whether pte markers were selected in the
config.

This also removes quite a few hard-coded checks for either shmem or
hugetlbfs. All the old checks should still work, but now through
vm_uffd_ops.

Note that anonymous memory still needs to be processed separately,
because it doesn't have vm_ops at all.
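
For illustration only (this is not part of the patch): the feature check described above can be modelled in user space roughly as follows. This is a minimal sketch under stated assumptions. All demo_* names and DEMO_* flag values are hypothetical stand-ins and are not the kernel's VM_UFFD_* bits or structures. The VM_DROPPABLE, pte marker and wp_async checks are left out on purpose, so only the "requested modes must be a subset of the supported modes" step is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; NOT the kernel's VM_UFFD_* bits. */
#define DEMO_UFFD_MISSING (1UL << 0)
#define DEMO_UFFD_WP      (1UL << 1)
#define DEMO_UFFD_MINOR   (1UL << 2)
#define DEMO_UFFD_FLAGS   (DEMO_UFFD_MISSING | DEMO_UFFD_WP | DEMO_UFFD_MINOR)

/* Reduced stand-in for struct vm_uffd_ops: only the feature bitmask. */
struct demo_uffd_ops {
	unsigned long uffd_features;
};

/* Reduced stand-in for a VMA: anonymous or not, plus optional ops. */
struct demo_vma {
	bool is_anonymous;
	const struct demo_uffd_ops *uffd_ops; /* NULL if no ops are provided */
};

/*
 * Return true if every requested mode is supported by this mapping.
 * Anonymous memory has no ops, so its supported set is fixed here;
 * other mappings report their set through uffd_features.
 */
bool demo_can_userfault(const struct demo_vma *vma, unsigned long requested)
{
	unsigned long supported;

	requested &= DEMO_UFFD_FLAGS;

	if (vma->is_anonymous)
		supported = DEMO_UFFD_MISSING | DEMO_UFFD_WP;
	else if (vma->uffd_ops)
		supported = vma->uffd_ops->uffd_features;
	else
		return false;

	/* Reject the request if it asks for any bit outside the set. */
	return !(requested & ~supported);
}

/* Small self-check exercising the three kinds of mapping. */
void demo_self_check(void)
{
	const struct demo_uffd_ops minor_only = { .uffd_features = DEMO_UFFD_MINOR };
	const struct demo_vma anon = { .is_anonymous = true, .uffd_ops = NULL };
	const struct demo_vma file = { .is_anonymous = false, .uffd_ops = &minor_only };
	const struct demo_vma bare = { .is_anonymous = false, .uffd_ops = NULL };

	assert(demo_can_userfault(&anon, DEMO_UFFD_MISSING | DEMO_UFFD_WP));
	assert(!demo_can_userfault(&anon, DEMO_UFFD_MINOR));
	assert(demo_can_userfault(&file, DEMO_UFFD_MINOR));
	assert(!demo_can_userfault(&file, DEMO_UFFD_MINOR | DEMO_UFFD_WP));
	assert(!demo_can_userfault(&bare, DEMO_UFFD_WP));
}
```

Calling demo_self_check() from a test program exercises an anonymous mapping, a mapping whose ops advertise only the MINOR mode, and a mapping with no ops at all. The point of the model is that a mapping type opts in by declaring a bitmask, so the core check no longer needs to name shmem or hugetlbfs.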
Reviewed-by: James Houghton Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- include/linux/userfaultfd_k.h | 46 +++++---------- mm/userfaultfd.c | 102 ++++++++++++++++++++++++++-------- 2 files changed, 91 insertions(+), 57 deletions(-) diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index b1949d8611238..e3704e27376ad 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -134,9 +134,14 @@ struct vm_uffd_ops { #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr)) #define MFILL_ATOMIC_MODE_MASK ((__force uffd_flags_t) (MFILL_ATOMIC_BIT(0= ) - 1)) =20 +static inline enum mfill_atomic_mode uffd_flags_get_mode(uffd_flags_t flag= s) +{ + return (__force enum mfill_atomic_mode)(flags & MFILL_ATOMIC_MODE_MASK); +} + static inline bool uffd_flags_mode_is(uffd_flags_t flags, enum mfill_atomi= c_mode expected) { - return (flags & MFILL_ATOMIC_MODE_MASK) =3D=3D ((__force uffd_flags_t) ex= pected); + return uffd_flags_get_mode(flags) =3D=3D expected; } =20 static inline uffd_flags_t uffd_flags_set_mode(uffd_flags_t flags, enum mf= ill_atomic_mode mode) @@ -245,41 +250,16 @@ static inline bool userfaultfd_armed(struct vm_area_s= truct *vma) return vma->vm_flags & __VM_UFFD_FLAGS; } =20 -static inline bool vma_can_userfault(struct vm_area_struct *vma, - vm_flags_t vm_flags, - bool wp_async) +static inline const struct vm_uffd_ops *vma_get_uffd_ops(struct vm_area_st= ruct *vma) { - vm_flags &=3D __VM_UFFD_FLAGS; - - if (vma->vm_flags & VM_DROPPABLE) - return false; - - if ((vm_flags & VM_UFFD_MINOR) && - (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma))) - return false; - - /* - * If wp async enabled, and WP is the only mode enabled, allow any - * memory type. 
-	 */
-	if (wp_async && (vm_flags == VM_UFFD_WP))
-		return true;
-
-#ifndef CONFIG_PTE_MARKER_UFFD_WP
-	/*
-	 * If user requested uffd-wp but not enabled pte markers for
-	 * uffd-wp, then shmem & hugetlbfs are not supported but only
-	 * anonymous.
-	 */
-	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
-		return false;
-#endif
-
-	/* By default, allow any of anon|shmem|hugetlb */
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	    vma_is_shmem(vma);
+	if (vma->vm_ops && vma->vm_ops->userfaultfd_ops)
+		return vma->vm_ops->userfaultfd_ops;
+	return NULL;
 }
 
+bool vma_can_userfault(struct vm_area_struct *vma,
+		       unsigned long vm_flags, bool wp_async);
+
 static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
 {
 	struct userfaultfd_ctx *uffd_ctx = vma->vm_userfaultfd_ctx.ctx;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index af61b95c89e4e..0a863ac123d84 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -20,6 +20,43 @@
 #include "internal.h"
 #include "swap.h"
 
+bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
+		       bool wp_async)
+{
+	unsigned long supported;
+
+	if (vma->vm_flags & VM_DROPPABLE)
+		return false;
+
+	vm_flags &= __VM_UFFD_FLAGS;
+
+#ifndef CONFIG_PTE_MARKER_UFFD_WP
+	/*
+	 * If user requested uffd-wp but not enabled pte markers for
+	 * uffd-wp, then any file system (like shmem or hugetlbfs) are not
+	 * supported but only anonymous.
+	 */
+	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
+		return false;
+#endif
+	/*
+	 * If wp async enabled, and WP is the only mode enabled, allow any
+	 * memory type.
+	 */
+	if (wp_async && (vm_flags == VM_UFFD_WP))
+		return true;
+
+	if (vma_is_anonymous(vma))
+		/* Anonymous has no page cache, MINOR not supported */
+		supported = VM_UFFD_MISSING | VM_UFFD_WP;
+	else if (vma_get_uffd_ops(vma))
+		supported = vma_get_uffd_ops(vma)->uffd_features;
+	else
+		return false;
+
+	return !(vm_flags & (~supported));
+}
+
 static __always_inline
 bool validate_dst_vma(struct vm_area_struct *dst_vma, unsigned long dst_end)
 {
@@ -382,13 +419,17 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
 				     unsigned long dst_addr,
 				     uffd_flags_t flags)
 {
+	const struct vm_uffd_ops *uffd_ops = vma_get_uffd_ops(dst_vma);
 	struct inode *inode = file_inode(dst_vma->vm_file);
 	pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
 	struct folio *folio;
 	struct page *page;
 	int ret;
 
-	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
+	if (WARN_ON_ONCE(!uffd_ops || !uffd_ops->uffd_get_folio))
+		return -EINVAL;
+
+	ret = uffd_ops->uffd_get_folio(inode, pgoff, &folio);
 	/* Our caller expects us to return -EFAULT if we failed to find folio */
 	if (ret == -ENOENT)
 		ret = -EFAULT;
@@ -504,18 +545,6 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
 	u32 hash;
 	struct address_space *mapping;
 
-	/*
-	 * There is no default zero huge page for all huge page sizes as
-	 * supported by hugetlb. A PMD_SIZE huge pages may exist as used
-	 * by THP. Since we can not reliably insert a zero page, this
-	 * feature is not supported.
-	 */
-	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) {
-		up_read(&ctx->map_changing_lock);
-		uffd_mfill_unlock(dst_vma);
-		return -EINVAL;
-	}
-
 	src_addr = src_start;
 	dst_addr = dst_start;
 	copied = 0;
@@ -694,6 +723,41 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
 	return err;
 }
 
+static inline bool
+vma_uffd_ops_supported(struct vm_area_struct *vma, uffd_flags_t flags)
+{
+	enum mfill_atomic_mode mode = uffd_flags_get_mode(flags);
+	const struct vm_uffd_ops *uffd_ops;
+	unsigned long uffd_ioctls;
+
+	if ((flags & MFILL_ATOMIC_WP) && !(vma->vm_flags & VM_UFFD_WP))
+		return false;
+
+	/* Anonymous supports everything except CONTINUE */
+	if (vma_is_anonymous(vma))
+		return mode != MFILL_ATOMIC_CONTINUE;
+
+	uffd_ops = vma_get_uffd_ops(vma);
+	if (!uffd_ops)
+		return false;
+
+	uffd_ioctls = uffd_ops->uffd_ioctls;
+	switch (mode) {
+	case MFILL_ATOMIC_COPY:
+		return uffd_ioctls & BIT(_UFFDIO_COPY);
+	case MFILL_ATOMIC_ZEROPAGE:
+		return uffd_ioctls & BIT(_UFFDIO_ZEROPAGE);
+	case MFILL_ATOMIC_CONTINUE:
+		if (!(vma->vm_flags & VM_SHARED))
+			return false;
+		return uffd_ioctls & BIT(_UFFDIO_CONTINUE);
+	case MFILL_ATOMIC_POISON:
+		return uffd_ioctls & BIT(_UFFDIO_POISON);
+	default:
+		return false;
+	}
+}
+
 static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 					    unsigned long dst_start,
 					    unsigned long src_start,
@@ -752,11 +816,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 	    dst_vma->vm_flags & VM_SHARED))
 		goto out_unlock;
 
-	/*
-	 * validate 'mode' now that we know the dst_vma: don't allow
-	 * a wrprotect copy if the userfaultfd didn't register as WP.
-	 */
-	if ((flags & MFILL_ATOMIC_WP) && !(dst_vma->vm_flags & VM_UFFD_WP))
+	if (!vma_uffd_ops_supported(dst_vma, flags))
 		goto out_unlock;
 
 	/*
@@ -766,12 +826,6 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 		return mfill_atomic_hugetlb(ctx, dst_vma, dst_start,
 					    src_start, len, flags);
 
-	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
-		goto out_unlock;
-	if (!vma_is_shmem(dst_vma) &&
-	    uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
-		goto out_unlock;
-
 	while (src_addr < src_start + len) {
 		pmd_t dst_pmdval;
 
-- 
2.50.1
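
For readers who want to experiment with the registration check outside the kernel, the decision order in the new vma_can_userfault() can be sketched in plain user-space C. This is only an illustrative model, not kernel code: every model_* name and MODEL_* value below is hypothetical, the flag values do not match the kernel's VM_UFFD_* definitions, and the sketch assumes CONFIG_PTE_MARKER_UFFD_WP is enabled, so the #ifndef branch from the patch is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for VM_UFFD_MISSING/WP/MINOR. The bit values are
 * illustrative only and are NOT the kernel's.
 */
#define MODEL_UFFD_MISSING	(1UL << 0)
#define MODEL_UFFD_WP		(1UL << 1)
#define MODEL_UFFD_MINOR	(1UL << 2)
#define MODEL_UFFD_FLAGS \
	(MODEL_UFFD_MISSING | MODEL_UFFD_WP | MODEL_UFFD_MINOR)

/* Model of the per-file-system feature mask carried by vm_uffd_ops. */
struct model_uffd_ops {
	unsigned long uffd_features;
};

/* Model of the few VMA properties the check looks at. */
struct model_vma {
	bool anonymous;				/* no vm_ops at all */
	bool droppable;				/* VM_DROPPABLE set */
	const struct model_uffd_ops *uffd_ops;	/* NULL if not provided */
};

/*
 * Same decision order as the patch, assuming pte markers are enabled:
 * droppable is rejected, WP-only with wp_async is accepted for any memory
 * type, and otherwise the requested modes must be a subset of what the
 * memory type supports.
 */
static bool model_can_userfault(const struct model_vma *vma,
				unsigned long vm_flags, bool wp_async)
{
	unsigned long supported;

	if (vma->droppable)
		return false;

	vm_flags &= MODEL_UFFD_FLAGS;

	if (wp_async && vm_flags == MODEL_UFFD_WP)
		return true;

	if (vma->anonymous)
		/* Anonymous has no page cache, so MINOR is not supported */
		supported = MODEL_UFFD_MISSING | MODEL_UFFD_WP;
	else if (vma->uffd_ops)
		supported = vma->uffd_ops->uffd_features;
	else
		return false;

	/* Reject if any requested mode falls outside the supported set */
	return !(vm_flags & ~supported);
}
```

In this model a file-backed VMA without ops can only be registered in the WP-only, wp_async case, and an anonymous VMA is rejected as soon as MINOR is requested, which mirrors how the patch replaces the hard-coded shmem and hugetlbfs tests with a single mask comparison.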