From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
To: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org, linux-mm@kvack.org, linux-nvme@lists.infradead.org, Bjorn Helgaas, Logan Gunthorpe, Alistair Popple, Leon Romanovsky, Greg Kroah-Hartman, Tejun Heo, "Rafael J. Wysocki", Danilo Krummrich, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg, houtao1@huawei.com
Subject: [PATCH 01/13] PCI/P2PDMA: Release the per-cpu ref of pgmap when vm_insert_page() fails
Date: Sat, 20 Dec 2025 12:04:34 +0800
Message-Id: <20251220040446.274991-2-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

When vm_insert_page() fails in p2pmem_alloc_mmap(), the error path does
not invoke percpu_ref_put() to release the per-cpu ref of the pgmap
acquired after gen_pool_alloc_owner(), so memunmap_pages() will hang
forever when trying to remove the PCIe device. Fix it by adding the
missing percpu_ref_put().
Fixes: 7e9c7ef83d78 ("PCI/P2PDMA: Allow userspace VMA allocations through sysfs")
Signed-off-by: Hou Tao
Reviewed-by: Alistair Popple
Reviewed-by: Logan Gunthorpe
---
 drivers/pci/p2pdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 4a2fc7ab42c3..218c1f5252b6 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -152,6 +152,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 	ret = vm_insert_page(vma, vaddr, page);
 	if (ret) {
 		gen_pool_free(p2pdma->pool, (uintptr_t)kaddr, len);
+		percpu_ref_put(ref);
 		return ret;
 	}
 	percpu_ref_get(ref);
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 02/13] PCI/P2PDMA: Fix the warning condition in p2pmem_alloc_mmap()
Date: Sat, 20 Dec 2025 12:04:35 +0800
Message-Id: <20251220040446.274991-3-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

Commit b7e282378773 already changed the initial page refcount of a
p2pdma page from one to zero. However, p2pmem_alloc_mmap() still uses
"VM_WARN_ON_ONCE_PAGE(!page_ref_count(page))" to assert that the
initial page refcount is not zero, so the following is reported when
CONFIG_DEBUG_VM is enabled:

  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x380400000
  flags: 0x20000000002000(reserved|node=0|zone=4)
  raw: 0020000000002000 ff1100015e3ab440 0000000000000000 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: VM_WARN_ON_ONCE_PAGE(!page_ref_count(page))
  ------------[ cut here ]------------
  WARNING: CPU: 5 PID: 449 at drivers/pci/p2pdma.c:240 p2pmem_alloc_mmap+0x83a/0xa60

Fix it by using "page_ref_count(page)" as the assertion condition.

Fixes: b7e282378773 ("mm/mm_init: move p2pdma page refcount initialisation to p2pdma")
Signed-off-by: Hou Tao
Reviewed-by: Alistair Popple
Reviewed-by: Logan Gunthorpe
---
 drivers/pci/p2pdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 218c1f5252b6..dd64ec830fdd 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -147,7 +147,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 	 * we have just allocated the page no one else should be
 	 * using it.
 	 */
-	VM_WARN_ON_ONCE_PAGE(!page_ref_count(page), page);
+	VM_WARN_ON_ONCE_PAGE(page_ref_count(page), page);
 	set_page_count(page, 1);
 	ret = vm_insert_page(vma, vaddr, page);
 	if (ret) {
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 03/13] kernfs: add support for get_unmapped_area callback
Date: Sat, 20 Dec 2025 12:04:36 +0800
Message-Id: <20251220040446.274991-4-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

kernfs already supports the ->mmap callback, but it does not support a
->get_unmapped_area callback that could return a PMD-aligned or
PUD-aligned virtual address. A following patch needs this to support
compound pages for p2pdma device memory, therefore add the necessary
support. When the ->get_unmapped_area callback is not defined in
kernfs_ops, or when the callback returns -EOPNOTSUPP,
kernfs_get_unmapped_area() falls back to mm_get_unmapped_area().
Signed-off-by: Hou Tao
---
 fs/kernfs/file.c       | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/kernfs.h |  3 +++
 2 files changed, 40 insertions(+)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 9adf36e6364b..9773b5734a2c 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -454,6 +454,39 @@ static const struct vm_operations_struct kernfs_vm_ops = {
 	.access		= kernfs_vma_access,
 };
 
+static unsigned long kernfs_get_unmapped_area(struct file *file, unsigned long uaddr,
+					      unsigned long len, unsigned long pgoff,
+					      unsigned long flags)
+{
+	struct kernfs_open_file *of = kernfs_of(file);
+	const struct kernfs_ops *ops;
+	long addr;
+
+	if (!(of->kn->flags & KERNFS_HAS_MMAP))
+		return -ENODEV;
+
+	mutex_lock(&of->mutex);
+
+	addr = -ENODEV;
+	if (!kernfs_get_active_of(of))
+		goto out_unlock;
+
+	ops = kernfs_ops(of->kn);
+	if (ops->get_unmapped_area) {
+		addr = ops->get_unmapped_area(of, uaddr, len, pgoff, flags);
+		if (!IS_ERR_VALUE(addr) || addr != -EOPNOTSUPP)
+			goto out_put;
+	}
+	addr = mm_get_unmapped_area(file, uaddr, len, pgoff, flags);
+
+out_put:
+	kernfs_put_active_of(of);
+out_unlock:
+	mutex_unlock(&of->mutex);
+
+	return addr;
+}
+
 static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct kernfs_open_file *of = kernfs_of(file);
@@ -1017,6 +1050,7 @@ const struct file_operations kernfs_file_fops = {
 	.write_iter	= kernfs_fop_write_iter,
 	.llseek		= kernfs_fop_llseek,
 	.mmap		= kernfs_fop_mmap,
+	.get_unmapped_area = kernfs_get_unmapped_area,
 	.open		= kernfs_fop_open,
 	.release	= kernfs_fop_release,
 	.poll		= kernfs_fop_poll,
@@ -1052,6 +1086,9 @@ struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent,
 	unsigned flags;
 	int rc;
 
+	if (ops->get_unmapped_area && !ops->mmap)
+		return ERR_PTR(-EINVAL);
+
 	flags = KERNFS_FILE;
 
 	kn = kernfs_new_node(parent, name, (mode & S_IALLUGO) | S_IFREG,
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index b5a5f32fdfd1..9467b0a2b339 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -324,6 +324,9 @@ struct kernfs_ops {
 
 	int (*mmap)(struct kernfs_open_file *of, struct vm_area_struct *vma);
 	loff_t (*llseek)(struct kernfs_open_file *of, loff_t offset, int whence);
+	unsigned long (*get_unmapped_area)(struct kernfs_open_file *of, unsigned long uaddr,
+					   unsigned long len, unsigned long pgoff,
+					   unsigned long flags);
 };
 
 /*
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 04/13] kernfs: add support for may_split and pagesize callbacks
Date: Sat, 20 Dec 2025 12:04:37 +0800
Message-Id: <20251220040446.274991-5-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

The ->may_split() and ->pagesize() callbacks are necessary for compound
page support. ->may_split() checks whether splitting the mapping of a
compound page is allowed during mprotect or mremap, and ->pagesize()
reports the correct page size in the /proc/${pid}/smaps file. These two
callbacks will be used by a following patch to enable mapping compound
pages of p2pdma memory into userspace, therefore add support for them.
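
As a hedged sketch of how a driver-side vm_operations_struct might back
these forwarded callbacks (the bar_* names are hypothetical, not part
of this patch), assuming a mapping built from PMD-sized folios:

	static int bar_vma_may_split(struct vm_area_struct *vma, unsigned long addr)
	{
		/* a PMD-mapped compound page cannot be mapped partially */
		if (!IS_ALIGNED(addr, PMD_SIZE))
			return -EINVAL;
		return 0;
	}

	static unsigned long bar_vma_pagesize(struct vm_area_struct *vma)
	{
		return PMD_SIZE;	/* reported as KernelPageSize in smaps */
	}
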
Signed-off-by: Hou Tao
---
 fs/kernfs/file.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 9773b5734a2c..5df45b1dbb36 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -384,6 +384,46 @@ static void kernfs_vma_open(struct vm_area_struct *vma)
 	kernfs_put_active_of(of);
 }
 
+static int kernfs_vma_may_split(struct vm_area_struct *vma, unsigned long addr)
+{
+	struct file *file = vma->vm_file;
+	struct kernfs_open_file *of = kernfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!kernfs_get_active_of(of))
+		return -ENODEV;
+
+	ret = 0;
+	if (of->vm_ops->may_split)
+		ret = of->vm_ops->may_split(vma, addr);
+
+	kernfs_put_active_of(of);
+	return ret;
+}
+
+static unsigned long kernfs_vma_pagesize(struct vm_area_struct *vma)
+{
+	struct file *file = vma->vm_file;
+	struct kernfs_open_file *of = kernfs_of(file);
+	unsigned long size;
+
+	if (!of->vm_ops)
+		return PAGE_SIZE;
+
+	if (!kernfs_get_active_of(of))
+		return PAGE_SIZE;
+
+	size = PAGE_SIZE;
+	if (of->vm_ops->pagesize)
+		size = of->vm_ops->pagesize(vma);
+
+	kernfs_put_active_of(of);
+	return size;
+}
+
 static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf)
 {
 	struct file *file = vmf->vma->vm_file;
@@ -449,9 +489,11 @@ static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
 
 static const struct vm_operations_struct kernfs_vm_ops = {
 	.open		= kernfs_vma_open,
+	.may_split	= kernfs_vma_may_split,
 	.fault		= kernfs_vma_fault,
 	.page_mkwrite	= kernfs_vma_page_mkwrite,
 	.access		= kernfs_vma_access,
+	.pagesize	= kernfs_vma_pagesize,
 };
 
 static unsigned long kernfs_get_unmapped_area(struct file *file, unsigned long uaddr,
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 05/13] sysfs: support get_unmapped_area callback for binary file
Date: Sat, 20 Dec 2025 12:04:38 +0800
Message-Id: <20251220040446.274991-6-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

Add support for a ->get_unmapped_area callback on binary sysfs files. A
following patch will use it to support compound pages for p2pdma device
memory when the device memory is mapped into userspace.
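
For illustration only (hypothetical baz_* names, not part of the
patch), a driver exposing a mappable binary attribute would then
provide both callbacks; a NULL ->get_unmapped_area, or one returning
-EOPNOTSUPP, falls back to mm_get_unmapped_area() via the kernfs layer:

	static const struct bin_attribute baz_mem_attr = {
		.attr			= { .name = "mem", .mode = 0600 },
		.mmap			= baz_mem_mmap,
		.get_unmapped_area	= baz_mem_get_unmapped_area,
	};
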
Signed-off-by: Hou Tao
---
 fs/sysfs/file.c       | 15 +++++++++++++++
 include/linux/sysfs.h |  4 ++++
 2 files changed, 19 insertions(+)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 3825e780cc58..e843795ebdc2 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -164,6 +164,20 @@ static ssize_t sysfs_kf_bin_write(struct kernfs_open_file *of, char *buf,
 	return battr->write(of->file, kobj, battr, buf, pos, count);
 }
 
+static unsigned long sysfs_kf_bin_get_unmapped_area(struct kernfs_open_file *of,
+						    unsigned long uaddr, unsigned long len,
+						    unsigned long pgoff, unsigned long flags)
+{
+	const struct bin_attribute *battr = of->kn->priv;
+	struct kobject *kobj;
+
+	if (!battr->get_unmapped_area)
+		return -EOPNOTSUPP;
+
+	kobj = sysfs_file_kobj(of->kn);
+	return battr->get_unmapped_area(of->file, kobj, battr, uaddr, len, pgoff, flags);
+}
+
 static int sysfs_kf_bin_mmap(struct kernfs_open_file *of,
 			     struct vm_area_struct *vma)
 {
@@ -268,6 +282,7 @@ static const struct kernfs_ops sysfs_bin_kfops_mmap = {
 	.mmap		= sysfs_kf_bin_mmap,
 	.open		= sysfs_kf_bin_open,
 	.llseek		= sysfs_kf_bin_llseek,
+	.get_unmapped_area = sysfs_kf_bin_get_unmapped_area,
 };
 
 int sysfs_add_file_mode_ns(struct kernfs_node *parent,
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index c33a96b7391a..f4a50f244f4d 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -321,6 +321,10 @@ struct bin_attribute {
 			 loff_t, int);
 	int (*mmap)(struct file *, struct kobject *, const struct bin_attribute *attr,
 		    struct vm_area_struct *vma);
+	unsigned long (*get_unmapped_area)(struct file *, struct kobject *,
+					   const struct bin_attribute *attr,
+					   unsigned long uaddr, unsigned long len,
+					   unsigned long pgoff, unsigned long flags);
 };
 
 /**
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 06/13] PCI/P2PDMA: add align parameter for pci_p2pdma_add_resource()
Date: Sat, 20 Dec 2025 12:04:39 +0800
Message-Id: <20251220040446.274991-7-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

The align parameter is used both to align the mapping of p2pdma memory
and to enable compound pages for p2pdma memory, in the kernel and in
userspace.
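
As a hedged usage sketch (not part of the diff below): a provider whose
BAR region is huge-page friendly could request PMD alignment, while the
converted in-tree callers keep today's behaviour by passing PAGE_SIZE.
Note that at this point in the series pci_p2pdma_check_pagemap_align()
still accepts only PAGE_SIZE.

	/* hypothetical provider: ask for PMD alignment when size and
	 * offset allow it, otherwise keep the old PAGE_SIZE behaviour
	 */
	size_t align = (IS_ALIGNED(size, PMD_SIZE) && IS_ALIGNED(offset, PMD_SIZE)) ?
		       PMD_SIZE : PAGE_SIZE;
	int rc = pci_p2pdma_add_resource(pdev, bar, size, align, offset);
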
Signed-off-by: Hou Tao
---
 drivers/accel/habanalabs/common/hldio.c |  3 +-
 drivers/nvme/host/pci.c                 |  2 +-
 drivers/pci/p2pdma.c                    | 38 +++++++++++++++++++++----
 include/linux/pci-p2pdma.h              |  4 +--
 4 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/accel/habanalabs/common/hldio.c b/drivers/accel/habanalabs/common/hldio.c
index 083ae5610875..4d1528dbde9f 100644
--- a/drivers/accel/habanalabs/common/hldio.c
+++ b/drivers/accel/habanalabs/common/hldio.c
@@ -372,7 +372,8 @@ int hl_p2p_region_init(struct hl_device *hdev, struct hl_p2p_region *p2pr)
 	int rc, i;
 
 	/* Start by publishing our p2p memory */
-	rc = pci_p2pdma_add_resource(hdev->pdev, p2pr->bar, p2pr->size, p2pr->bar_offset);
+	rc = pci_p2pdma_add_resource(hdev->pdev, p2pr->bar, p2pr->size, PAGE_SIZE,
+				     p2pr->bar_offset);
 	if (rc) {
 		dev_err(hdev->dev, "error adding p2p resource: %d\n", rc);
 		goto err;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 0e4caeab739c..b070095bae5e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2309,7 +2309,7 @@ static void nvme_map_cmb(struct nvme_dev *dev)
 			    dev->bar + NVME_REG_CMBMSC);
 	}
 
-	if (pci_p2pdma_add_resource(pdev, bar, size, offset)) {
+	if (pci_p2pdma_add_resource(pdev, bar, size, PAGE_SIZE, offset)) {
 		dev_warn(dev->ctrl.device,
 			 "failed to register the CMB\n");
 		hi_lo_writeq(0, dev->bar + NVME_REG_CMBMSC);
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index dd64ec830fdd..70482e240304 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -23,6 +23,7 @@
 
 struct pci_p2pdma {
 	struct gen_pool *pool;
+	size_t align;
 	bool p2pmem_published;
 	struct xarray map_types;
 	struct p2pdma_provider mem[PCI_STD_NUM_BARS];
@@ -211,7 +212,7 @@ static void p2pdma_folio_free(struct folio *folio)
 	struct percpu_ref *ref;
 
 	gen_pool_free_owner(p2pdma->pool, (uintptr_t)page_to_virt(page),
-			    PAGE_SIZE, (void **)&ref);
+			    p2pdma->align, (void **)&ref);
 	percpu_ref_put(ref);
 }
 
@@ -323,17 +324,22 @@ struct p2pdma_provider *pcim_p2pdma_provider(struct pci_dev *pdev, int bar)
 }
 EXPORT_SYMBOL_GPL(pcim_p2pdma_provider);
 
-static int pci_p2pdma_setup_pool(struct pci_dev *pdev)
+static int pci_p2pdma_setup_pool(struct pci_dev *pdev, size_t align)
 {
 	struct pci_p2pdma *p2pdma;
 	int ret;
 
 	p2pdma = rcu_dereference_protected(pdev->p2pdma, 1);
-	if (p2pdma->pool)
+	if (p2pdma->pool) {
+		/* Two p2pdma BARs with different alignment ? */
+		if (p2pdma->align != align)
+			return -EINVAL;
 		/* We already setup pools, do nothing, */
 		return 0;
+	}
 
-	p2pdma->pool = gen_pool_create(PAGE_SHIFT, dev_to_node(&pdev->dev));
+	p2pdma->align = align;
+	p2pdma->pool = gen_pool_create(ilog2(p2pdma->align), dev_to_node(&pdev->dev));
 	if (!p2pdma->pool)
 		return -ENOMEM;
 
@@ -363,18 +369,31 @@ static void pci_p2pdma_unmap_mappings(void *data)
 			       p2pmem_group.name);
 }
 
+static inline int pci_p2pdma_check_pagemap_align(struct pci_dev *pdev, int bar,
+						 u64 size, size_t align,
+						 u64 offset)
+{
+	if (align == PAGE_SIZE)
+		return 0;
+	return -EINVAL;
+}
+
 /**
  * pci_p2pdma_add_resource - add memory for use as p2p memory
  * @pdev: the device to add the memory to
  * @bar: PCI BAR to add
  * @size: size of the memory to add, may be zero to use the whole BAR
+ * @align: dev memory mapping alignment of the memory to add. It is used
+ *	to optimize the mappings both in userspace and kernel space when
+ *	transparent huge page is supported. The possible values are
+ *	PAGE_SIZE, PMD_SIZE, and PUD_SIZE.
  * @offset: offset into the PCI BAR
  *
  * The memory will be given ZONE_DEVICE struct pages so that it may
  * be used with any DMA request.
  */
 int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
-			    u64 offset)
+			    size_t align, u64 offset)
 {
 	struct pci_p2pdma_pagemap *p2p_pgmap;
 	struct p2pdma_provider *mem;
@@ -395,11 +414,18 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 	if (size + offset > pci_resource_len(pdev, bar))
 		return -EINVAL;
 
+	error = pci_p2pdma_check_pagemap_align(pdev, bar, size, align, offset);
+	if (error) {
+		pci_info_ratelimited(pdev, "invalid align 0x%zx for bar %d\n",
+				     align, bar);
+		return error;
+	}
+
 	error = pcim_p2pdma_init(pdev);
 	if (error)
 		return error;
 
-	error = pci_p2pdma_setup_pool(pdev);
+	error = pci_p2pdma_setup_pool(pdev, align);
 	if (error)
 		return error;
 
diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
index 517e121d2598..2fa671274c45 100644
--- a/include/linux/pci-p2pdma.h
+++ b/include/linux/pci-p2pdma.h
@@ -69,7 +69,7 @@ enum pci_p2pdma_map_type {
 int pcim_p2pdma_init(struct pci_dev *pdev);
 struct p2pdma_provider *pcim_p2pdma_provider(struct pci_dev *pdev, int bar);
 int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
-			    u64 offset);
+			    size_t align, u64 offset);
 int pci_p2pdma_distance_many(struct pci_dev *provider, struct device **clients,
 			     int num_clients, bool verbose);
 struct pci_dev *pci_p2pmem_find_many(struct device **clients, int num_clients);
@@ -97,7 +97,7 @@ static inline struct p2pdma_provider *pcim_p2pdma_provider(struct pci_dev *pdev,
 	return NULL;
 }
 static inline int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar,
-					  size_t size, u64 offset)
+					  size_t size, size_t align, u64 offset)
 {
 	return -EOPNOTSUPP;
 }
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 07/13] PCI/P2PDMA: create compound page for aligned p2pdma memory
Date: Sat, 20 Dec 2025 12:04:40 +0800
Message-Id: <20251220040446.274991-8-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

Commit c4386bd8ee3a ("mm/memremap: add ZONE_DEVICE support for compound
pages") already added compound page support for ZONE_DEVICE memory. It
not only decreases the memory overhead of ZONE_DEVICE memory through
the deduplication of vmemmap pages, it also optimizes the performance
of get_user_pages() when the memory is used for IO. Now that the
alignment of p2pdma memory is known, set vmemmap_shift accordingly to
create compound pages for p2pdma memory.
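
For concreteness, the shift arithmetic with 4 KiB base pages (an
assumption for this example, not stated by the patch):

	/* align = PMD_SIZE (2 MiB): vmemmap_shift = 21 - 12 = 9,
	 *   so each compound page spans 2^9 = 512 base pages;
	 * align = PUD_SIZE (1 GiB): vmemmap_shift = 30 - 12 = 18,
	 *   i.e. 2^18 = 262144 base pages.
	 */
	pgmap->vmemmap_shift = ilog2(align) - PAGE_SHIFT;
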
Signed-off-by: Hou Tao
---
 drivers/pci/p2pdma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 70482e240304..7180dea4855c 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -447,6 +447,8 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 	pgmap->nr_range = 1;
 	pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
 	pgmap->ops = &p2pdma_pgmap_ops;
+	if (align > PAGE_SIZE)
+		pgmap->vmemmap_shift = ilog2(align) - PAGE_SHIFT;
 	p2p_pgmap->mem = mem;
 
 	addr = devm_memremap_pages(&pdev->dev, pgmap);
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 08/13] mm/huge_memory: add helpers to insert huge page during mmap
Date: Sat, 20 Dec 2025 12:04:41 +0800
Message-Id: <20251220040446.274991-9-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

vmf_insert_folio_{pmd,pud}() can be used to insert huge pages during a
page fault. However, for simplicity, the mapping of p2pdma memory
inserts all necessary pages during mmap. Therefore, add the
vm_insert_folio_{pmd,pud}() helpers to support inserting PMD-sized and
PUD-sized pages during mmap.
Signed-off-by: Hou Tao
---
 include/linux/huge_mm.h |  4 +++
 mm/huge_memory.c        | 66 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index a4d9f964dfde..8cf8bb85be79 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -45,6 +45,10 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
 				bool write);
 vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
 				bool write);
+int vm_insert_folio_pmd(struct vm_area_struct *vma, unsigned long addr,
+			struct folio *folio);
+int vm_insert_folio_pud(struct vm_area_struct *vma, unsigned long addr,
+			struct folio *folio);
 
 enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_UNSUPPORTED,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40cf59301c21..11d19f8986da 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1644,6 +1644,41 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
 }
 EXPORT_SYMBOL_GPL(vmf_insert_folio_pmd);
 
+int vm_insert_folio_pmd(struct vm_area_struct *vma, unsigned long addr,
+			struct folio *folio)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	struct folio_or_pfn fop = {
+		.folio = folio,
+		.is_folio = true,
+	};
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+	vm_fault_t fault_err;
+
+	mmap_assert_write_locked(mm);
+
+	pgd = pgd_offset(mm, addr);
+	p4d = p4d_alloc(mm, pgd, addr);
+	if (!p4d)
+		return -ENOMEM;
+	pud = pud_alloc(mm, p4d, addr);
+	if (!pud)
+		return -ENOMEM;
+	pmd = pmd_alloc(mm, pud, addr);
+	if (!pmd)
+		return -ENOMEM;
+
+	fault_err = insert_pmd(vma, addr, pmd, fop, vma->vm_page_prot,
+			       vma->vm_flags & VM_WRITE);
+	if (fault_err != VM_FAULT_NOPAGE)
+		return -EINVAL;
+
+	return 0;
+}
+
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
 {
@@ -1759,6 +1794,37 @@ vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
 	return insert_pud(vma, addr, vmf->pud, fop, vma->vm_page_prot, write);
 }
 EXPORT_SYMBOL_GPL(vmf_insert_folio_pud);
+
+int vm_insert_folio_pud(struct vm_area_struct *vma, unsigned long addr,
+			struct folio *folio)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	struct folio_or_pfn fop = {
+		.folio = folio,
+		.is_folio = true,
+	};
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	vm_fault_t fault_err;
+
+	mmap_assert_write_locked(mm);
+
+	pgd = pgd_offset(mm, addr);
+	p4d = p4d_alloc(mm, pgd, addr);
+	if (!p4d)
+		return -ENOMEM;
+	pud = pud_alloc(mm, p4d, addr);
+	if (!pud)
+		return -ENOMEM;
+
+	fault_err = insert_pud(vma, addr, pud, fop, vma->vm_page_prot,
+			       vma->vm_flags & VM_WRITE);
+	if (fault_err != VM_FAULT_NOPAGE)
+		return -EINVAL;
+
+	return 0;
+}
 #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 
 /**
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
Subject: [PATCH 09/13] PCI/P2PDMA: support get_unmapped_area to return aligned vaddr
Date: Sat, 20 Dec 2025 12:04:42 +0800
Message-Id: <20251220040446.274991-10-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

P2PDMA memory already supports compound pages.
When mmapping the P2PDMA memory in the userspace, the mmap procedure needs to use an aligned virtual address to match the alignment of P2PDMA memory. Therefore, implement get_unmapped_area for p2pdma memory to return an aligned virtual address. Signed-off-by: Hou Tao --- drivers/pci/p2pdma.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 7180dea4855c..e97f5da73458 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -90,6 +90,44 @@ static ssize_t published_show(struct device *dev, struct= device_attribute *attr, } static DEVICE_ATTR_RO(published); =20 +static unsigned long p2pmem_get_unmapped_area(struct file *filp, struct ko= bject *kobj, + const struct bin_attribute *attr, + unsigned long uaddr, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + struct pci_dev *pdev =3D to_pci_dev(kobj_to_dev(kobj)); + struct pci_p2pdma *p2pdma; + unsigned long aligned_len; + unsigned long addr; + unsigned long align; + + if (pgoff) + return -EINVAL; + + rcu_read_lock(); + p2pdma =3D rcu_dereference(pdev->p2pdma); + if (!p2pdma) { + rcu_read_unlock(); + return -ENODEV; + } + align =3D p2pdma->align; + rcu_read_unlock(); + + /* Fixed address */ + if (uaddr) + goto out; + + aligned_len =3D len + align; + if (aligned_len < len) + goto out; + + addr =3D mm_get_unmapped_area(filp, uaddr, aligned_len, pgoff, flags); + if (!IS_ERR_VALUE(addr)) + return round_up(addr, align); +out: + return mm_get_unmapped_area(filp, uaddr, len, pgoff, flags); +} + static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj, const struct bin_attribute *attr, struct vm_area_struct *vma) { @@ -175,6 +213,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct = kobject *kobj, static const struct bin_attribute p2pmem_alloc_attr =3D { .attr =3D { .name =3D "allocate", .mode =3D 0660 }, .mmap =3D p2pmem_alloc_mmap, + .get_unmapped_area =3D p2pmem_get_unmapped_area, /* * Some places where we want to call mmap (ie. 
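[ A minimal userspace sketch of the intended effect -- assuming a 2 MiB
p2pdma alignment and a hypothetical sysfs device path. With
p2pmem_get_unmapped_area() wired up, the kernel over-allocates the
search window by the alignment and rounds the result up, so the
assertion below should hold: ]

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const size_t len = 2UL << 20;	/* one PMD-sized chunk */
	int fd = open("/sys/bus/pci/devices/0000:01:00.0/p2pmem/allocate",
		      O_RDWR);
	void *p;

	if (fd < 0)
		return 1;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;
	/* the returned address is expected to be PMD-aligned */
	assert(((uintptr_t)p & ((2UL << 20) - 1)) == 0);
	munmap(p, len);
	close(fd);
	return 0;
}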
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
To: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org, linux-mm@kvack.org,
 linux-nvme@lists.infradead.org, Bjorn Helgaas, Logan Gunthorpe,
 Alistair Popple, Leon Romanovsky, Greg Kroah-Hartman, Tejun Heo,
 "Rafael J. Wysocki", Danilo Krummrich, Andrew Morton, David Hildenbrand,
 Lorenzo Stoakes, Keith Busch, Jens Axboe, Christoph Hellwig,
 Sagi Grimberg, houtao1@huawei.com
Subject: [PATCH 10/13] PCI/P2PDMA: support compound page in p2pmem_alloc_mmap()
Date: Sat, 20 Dec 2025 12:04:43 +0800
Message-Id: <20251220040446.274991-11-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>
References: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

P2PDMA memory already supports compound pages, and the helpers for
inserting compound pages into a VMA are in place as well, so add
compound page support to p2pmem_alloc_mmap() too. This greatly reduces
the overhead of mmap() and get_user_pages() when compound pages are
enabled for p2pdma memory.

The use of vm_private_data to save the alignment of the p2pdma memory
deserves an explanation. The normal way to get the alignment is through
the pci_dev. That could be done either by invoking kernfs_of() and
sysfs_file_kobj(), or by defining a new struct kernfs_vm_ops to pass
the kobject to the may_split() and ->pagesize() callbacks. The former
approach depends too much on kernfs implementation details, and the
latter would lead to excessive churn. Therefore, choose the simpler way
of saving the alignment in vm_private_data instead.
Signed-off-by: Hou Tao
---
 drivers/pci/p2pdma.c | 48 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index e97f5da73458..4a133219ac43 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -128,6 +128,25 @@ static unsigned long p2pmem_get_unmapped_area(struct file *filp, struct kobject
 	return mm_get_unmapped_area(filp, uaddr, len, pgoff, flags);
 }
 
+static int p2pmem_may_split(struct vm_area_struct *vma, unsigned long addr)
+{
+	size_t align = (uintptr_t)vma->vm_private_data;
+
+	if (!IS_ALIGNED(addr, align))
+		return -EINVAL;
+	return 0;
+}
+
+static unsigned long p2pmem_pagesize(struct vm_area_struct *vma)
+{
+	return (uintptr_t)vma->vm_private_data;
+}
+
+static const struct vm_operations_struct p2pmem_vm_ops = {
+	.may_split = p2pmem_may_split,
+	.pagesize = p2pmem_pagesize,
+};
+
 static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 		const struct bin_attribute *attr, struct vm_area_struct *vma)
 {
@@ -136,6 +155,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 	struct pci_p2pdma *p2pdma;
 	struct percpu_ref *ref;
 	unsigned long vaddr;
+	size_t align;
 	void *kaddr;
 	int ret;
 
@@ -161,6 +181,16 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 		goto out;
 	}
 
+	align = p2pdma->align;
+	if (vma->vm_start & (align - 1) || vma->vm_end & (align - 1)) {
+		pci_info_ratelimited(pdev,
+				     "%s: unaligned vma (%#lx~%#lx, %#lx)\n",
+				     current->comm, vma->vm_start, vma->vm_end,
+				     align);
+		ret = -EINVAL;
+		goto out;
+	}
+
 	kaddr = (void *)gen_pool_alloc_owner(p2pdma->pool, len, (void **)&ref);
 	if (!kaddr) {
 		ret = -ENOMEM;
@@ -178,7 +208,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 	}
 	rcu_read_unlock();
 
-	for (vaddr = vma->vm_start; vaddr < vma->vm_end; vaddr += PAGE_SIZE) {
+	for (vaddr = vma->vm_start; vaddr < vma->vm_end; vaddr += align) {
 		struct page *page = virt_to_page(kaddr);
 
 		/*
@@ -188,7 +218,12 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 		 */
 		VM_WARN_ON_ONCE_PAGE(page_ref_count(page), page);
 		set_page_count(page, 1);
-		ret = vm_insert_page(vma, vaddr, page);
+		if (align == PUD_SIZE)
+			ret = vm_insert_folio_pud(vma, vaddr, page_folio(page));
+		else if (align == PMD_SIZE)
+			ret = vm_insert_folio_pmd(vma, vaddr, page_folio(page));
+		else
+			ret = vm_insert_page(vma, vaddr, page);
 		if (ret) {
 			gen_pool_free(p2pdma->pool, (uintptr_t)kaddr, len);
 			percpu_ref_put(ref);
@@ -196,10 +231,15 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
 		}
 		percpu_ref_get(ref);
 		put_page(page);
-		kaddr += PAGE_SIZE;
-		len -= PAGE_SIZE;
+		kaddr += align;
+		len -= align;
 	}
 
+	/* Disable unaligned splitting due to vma merge */
+	vm_flags_set(vma, VM_DONTEXPAND);
+	vma->vm_ops = &p2pmem_vm_ops;
+	vma->vm_private_data = (void *)(uintptr_t)align;
+
 	percpu_ref_put(ref);
 
 	return 0;
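[ A userspace sketch of the new ->may_split() restriction -- again with
a hypothetical device path and an assumed 2 MiB alignment. Unmapping a
4 KiB sub-range would split the VMA at an unaligned address, which
p2pmem_may_split() rejects, so munmap() is expected to fail with
EINVAL: ]

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const size_t len = 4UL << 20;	/* two PMD-sized chunks */
	int fd = open("/sys/bus/pci/devices/0000:01:00.0/p2pmem/allocate",
		      O_RDWR);
	char *p;

	if (fd < 0)
		return 1;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;
	/* punch an unaligned 4 KiB hole: should fail with EINVAL */
	if (munmap(p + 4096, 4096) == -1 && errno == EINVAL)
		printf("unaligned split rejected, as expected\n");
	munmap(p, len);
	close(fd);
	return 0;
}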
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
To: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org, linux-mm@kvack.org,
 linux-nvme@lists.infradead.org, Bjorn Helgaas, Logan Gunthorpe,
 Alistair Popple, Leon Romanovsky, Greg Kroah-Hartman, Tejun Heo,
 "Rafael J. Wysocki", Danilo Krummrich, Andrew Morton, David Hildenbrand,
 Lorenzo Stoakes, Keith Busch, Jens Axboe, Christoph Hellwig,
 Sagi Grimberg, houtao1@huawei.com
Subject: [PATCH 11/13] PCI/P2PDMA: add helper pci_p2pdma_max_pagemap_align()
Date: Sat, 20 Dec 2025 12:04:44 +0800
Message-Id: <20251220040446.274991-12-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>
References: <20251220040446.274991-1-houtao@huaweicloud.com>
From: Hou Tao

Add the helper pci_p2pdma_max_pagemap_align() to find the maximum
possible alignment for mapping p2pdma memory in both userspace and
kernel space. When transparent huge pages are supported and the
physical address of the BAR, its size, and the offset are all
{PUD|PMD}_SIZE-aligned, it returns {PUD|PMD}_SIZE accordingly.
Otherwise, it returns PAGE_SIZE.

Signed-off-by: Hou Tao
---
 include/linux/pci-p2pdma.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
index 2fa671274c45..5d940b9e5338 100644
--- a/include/linux/pci-p2pdma.h
+++ b/include/linux/pci-p2pdma.h
@@ -210,4 +210,30 @@ pci_p2pdma_bus_addr_map(struct p2pdma_provider *provider, phys_addr_t paddr)
 	return paddr + provider->bus_offset;
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static inline size_t pci_p2pdma_max_pagemap_align(struct pci_dev *pdev, int bar,
+						  u64 size, u64 offset)
+{
+	resource_size_t start = pci_resource_start(pdev, bar);
+
+	if (has_transparent_pud_hugepage() &&
+	    IS_ALIGNED(start, PUD_SIZE) && IS_ALIGNED(size, PUD_SIZE) &&
+	    IS_ALIGNED(offset, PUD_SIZE))
+		return PUD_SIZE;
+
+	if (has_transparent_hugepage() &&
+	    IS_ALIGNED(start, PMD_SIZE) && IS_ALIGNED(size, PMD_SIZE) &&
+	    IS_ALIGNED(offset, PMD_SIZE))
+		return PMD_SIZE;
+
+	return PAGE_SIZE;
+}
+#else
+static inline size_t pci_p2pdma_max_pagemap_align(struct pci_dev *pdev, int bar,
+						  u64 size, u64 offset)
+{
+	return PAGE_SIZE;
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
 #endif /* _LINUX_PCI_P2P_H */
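[ A standalone worked example of the selection logic, assuming x86-64
sizes (PMD_SIZE = 2 MiB, PUD_SIZE = 1 GiB) and that both THP levels are
available; max_pagemap_align() is a userspace stand-in for the new
helper: ]

#include <stdint.h>
#include <stdio.h>

#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)
#define PMD_SIZE		(2ULL << 20)
#define PUD_SIZE		(1ULL << 30)
#define PAGE_SIZE		4096ULL

static uint64_t max_pagemap_align(uint64_t start, uint64_t size,
				  uint64_t offset)
{
	if (IS_ALIGNED(start, PUD_SIZE) && IS_ALIGNED(size, PUD_SIZE) &&
	    IS_ALIGNED(offset, PUD_SIZE))
		return PUD_SIZE;
	if (IS_ALIGNED(start, PMD_SIZE) && IS_ALIGNED(size, PMD_SIZE) &&
	    IS_ALIGNED(offset, PMD_SIZE))
		return PMD_SIZE;
	return PAGE_SIZE;
}

int main(void)
{
	/*
	 * A 64 MiB CMB BAR at 0xe0000000 with no offset is 2 MiB-aligned
	 * but not 1 GiB-aligned, so this prints 0x200000 (PMD_SIZE).
	 */
	printf("%#llx\n", (unsigned long long)
	       max_pagemap_align(0xe0000000ULL, 64ULL << 20, 0));
	return 0;
}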
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
To: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org, linux-mm@kvack.org,
 linux-nvme@lists.infradead.org, Bjorn Helgaas, Logan Gunthorpe,
 Alistair Popple, Leon Romanovsky, Greg Kroah-Hartman, Tejun Heo,
 "Rafael J. Wysocki", Danilo Krummrich, Andrew Morton, David Hildenbrand,
 Lorenzo Stoakes, Keith Busch, Jens Axboe, Christoph Hellwig,
 Sagi Grimberg, houtao1@huawei.com
Subject: [PATCH 12/13] nvme-pci: introduce cmb_devmap_align module parameter
Date: Sat, 20 Dec 2025 12:04:45 +0800
Message-Id: <20251220040446.274991-13-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>
References: <20251220040446.274991-1-houtao@huaweicloud.com>

From: Hou Tao

P2PDMA memory now supports compound pages. Ideally, compound pages
would be enabled for p2pdma memory automatically, according to the
address, the size, and the offset of the CMB. However, for an NVMe
device the p2pdma memory may also be used in kernel space (e.g., for SQ
entries), and a PUD- or PMD-sized page would waste a lot of memory
there. Therefore, introduce the module parameter cmb_devmap_align to
control the alignment of the p2pdma memory mapping. Its default value
is PAGE_SIZE. When it is set to zero, pci_p2pdma_max_pagemap_align() is
used to find the maximum possible mapping alignment.
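[ Usage note: like other read-only (0444) nvme module parameters, this
is chosen at load time, e.g. nvme.cmb_devmap_align=0x200000 on the
kernel command line (or the equivalent modprobe option) to force
PMD-sized mappings, or nvme.cmb_devmap_align=0 to let the driver pick
the largest alignment the BAR supports. ]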
Signed-off-by: Hou Tao
---
 drivers/nvme/host/pci.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b070095bae5e..ca0126e36834 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -79,6 +79,10 @@ static bool use_cmb_sqes = true;
 module_param(use_cmb_sqes, bool, 0444);
 MODULE_PARM_DESC(use_cmb_sqes, "use controller's memory buffer for I/O SQes");
 
+static unsigned long cmb_devmap_align = PAGE_SIZE;
+module_param(cmb_devmap_align, ulong, 0444);
+MODULE_PARM_DESC(cmb_devmap_align, "the mapping alignment of CMB");
+
 static unsigned int max_host_mem_size_mb = 128;
 module_param(max_host_mem_size_mb, uint, 0444);
 MODULE_PARM_DESC(max_host_mem_size_mb,
@@ -2266,6 +2270,7 @@ static void nvme_map_cmb(struct nvme_dev *dev)
 	u64 size, offset;
 	resource_size_t bar_size;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
+	size_t align;
 	int bar;
 
 	if (dev->cmb_size)
@@ -2309,7 +2314,10 @@ static void nvme_map_cmb(struct nvme_dev *dev)
 			  dev->bar + NVME_REG_CMBMSC);
 	}
 
-	if (pci_p2pdma_add_resource(pdev, bar, size, PAGE_SIZE, offset)) {
+	align = cmb_devmap_align;
+	if (!align)
+		align = pci_p2pdma_max_pagemap_align(pdev, bar, size, offset);
+	if (pci_p2pdma_add_resource(pdev, bar, size, align, offset)) {
 		dev_warn(dev->ctrl.device,
 			 "failed to register the CMB\n");
 		hi_lo_writeq(0, dev->bar + NVME_REG_CMBMSC);
-- 
2.29.2

From nobody Mon Feb 9 08:59:50 2026
From: Hou Tao
To: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org, linux-mm@kvack.org,
 linux-nvme@lists.infradead.org, Bjorn Helgaas, Logan Gunthorpe,
 Alistair Popple, Leon Romanovsky, Greg Kroah-Hartman, Tejun Heo,
 "Rafael J. Wysocki", Danilo Krummrich, Andrew Morton, David Hildenbrand,
 Lorenzo Stoakes, Keith Busch, Jens Axboe, Christoph Hellwig,
 Sagi Grimberg, houtao1@huawei.com
Subject: [PATCH 13/13] PCI/P2PDMA: enable compound page support for p2pdma memory
Date: Sat, 20 Dec 2025 12:04:46 +0800
Message-Id: <20251220040446.274991-14-houtao@huaweicloud.com>
In-Reply-To: <20251220040446.274991-1-houtao@huaweicloud.com>
References: <20251220040446.274991-1-houtao@huaweicloud.com>
From: Hou Tao

Compound page support for P2PDMA memory in both kernel space and
userspace is now in place. Enable it by allowing PUD_SIZE and PMD_SIZE
alignments.

Signed-off-by: Hou Tao
---
 drivers/pci/p2pdma.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 4a133219ac43..969bdacdcf8b 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -452,9 +452,19 @@ static inline int pci_p2pdma_check_pagemap_align(struct pci_dev *pdev, int bar,
 					u64 size, size_t align, u64 offset)
 {
+	if (has_transparent_pud_hugepage() && align == PUD_SIZE)
+		goto more_check;
+	if (has_transparent_hugepage() && align == PMD_SIZE)
+		goto more_check;
 	if (align == PAGE_SIZE)
 		return 0;
 	return -EINVAL;
+
+more_check:
+	if (IS_ALIGNED(pci_resource_start(pdev, bar), align) &&
+	    IS_ALIGNED(size, align) && IS_ALIGNED(offset, align))
+		return 0;
+	return -EINVAL;
 }
 
 /**
-- 
2.29.2
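[ A driver-side sketch of how a provider opts in after this series --
assuming the pci_p2pdma_add_resource() signature used throughout the
series (size, then alignment, then offset); example_map_bar() is a
hypothetical caller and error handling is trimmed: ]

#include <linux/pci.h>
#include <linux/pci-p2pdma.h>

static int example_map_bar(struct pci_dev *pdev, int bar)
{
	u64 size = pci_resource_len(pdev, bar);
	/* largest alignment the BAR start/size/offset permit, or PAGE_SIZE */
	size_t align = pci_p2pdma_max_pagemap_align(pdev, bar, size, 0);

	/* rejected with -EINVAL unless start/size/offset all fit align */
	return pci_p2pdma_add_resource(pdev, bar, size, align, 0);
}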