From nobody Wed Oct 8 10:02:58 2025
From: lizhe.67@bytedance.com
To: alex.williamson@redhat.com, jgg@ziepe.ca, david@redhat.com, peterx@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, lizhe.67@bytedance.com
Subject: [PATCH 1/4] vfio/type1: optimize vfio_pin_pages_remote() for large folios
Date: Mon, 30 Jun 2025 15:25:15 +0800
Message-ID: <20250630072518.31846-2-lizhe.67@bytedance.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250630072518.31846-1-lizhe.67@bytedance.com>
References: <20250630072518.31846-1-lizhe.67@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Li Zhe

When vfio_pin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs the statistics
counting operations individually for each page. This can lead to
significant performance overhead, especially when dealing with large
ranges of pages.

This patch optimizes this process by batching the statistics counting
operations.

The performance test results for completing the 16G VFIO IOMMU DMA
mapping are as follows.

Base(v6.16-rc4):
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.047 s (340.2 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.280 s (57.2 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.052 s (310.5 GB/s)

With this patch:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.027 s (596.5 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.290 s (55.2 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.031 s (511.1 GB/s)

For large folios, the mapping time drops by over 40%. For small folios,
the test results show a slight performance drop.

Signed-off-by: Li Zhe
Co-developed-by: Alex Williamson
Signed-off-by: Alex Williamson
---
 drivers/vfio/vfio_iommu_type1.c | 93 ++++++++++++++++++++++++++++-----
 1 file changed, 81 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 1136d7ac6b59..a2d7abd4f2c2 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -318,7 +318,13 @@ static void vfio_dma_bitmap_free_all(struct vfio_iommu *iommu)
 /*
  * Helper Functions for host iova-pfn list
  */
-static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
+
+/*
+ * Find the highest vfio_pfn that overlaps the range
+ * [iova_start, iova_end) in the rb tree.
+ */
+static struct vfio_pfn *vfio_find_vpfn_range(struct vfio_dma *dma,
+		dma_addr_t iova_start, dma_addr_t iova_end)
 {
 	struct vfio_pfn *vpfn;
 	struct rb_node *node = dma->pfn_list.rb_node;
@@ -326,9 +332,9 @@ static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
 	while (node) {
 		vpfn = rb_entry(node, struct vfio_pfn, node);
 
-		if (iova < vpfn->iova)
+		if (iova_end <= vpfn->iova)
 			node = node->rb_left;
-		else if (iova > vpfn->iova)
+		else if (iova_start > vpfn->iova)
 			node = node->rb_right;
 		else
 			return vpfn;
@@ -336,6 +342,11 @@ static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
 	return NULL;
 }
 
+static inline struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
+{
+	return vfio_find_vpfn_range(dma, iova, iova + PAGE_SIZE);
+}
+
 static void vfio_link_pfn(struct vfio_dma *dma,
 			  struct vfio_pfn *new)
 {
@@ -614,6 +625,56 @@ static long vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
 	return ret;
 }
 
+static long contig_pages(struct vfio_dma *dma,
+			 struct vfio_batch *batch, dma_addr_t iova)
+{
+	struct page *page = batch->pages[batch->offset];
+	struct folio *folio = page_folio(page);
+	long idx = folio_page_idx(folio, page);
+	long max = min_t(long, batch->size, folio_nr_pages(folio) - idx);
+	long nr_pages;
+
+	for (nr_pages = 1; nr_pages < max; nr_pages++) {
+		if (batch->pages[batch->offset + nr_pages] !=
+		    folio_page(folio, idx + nr_pages))
+			break;
+	}
+
+	return nr_pages;
+}
+
+static long vpfn_pages(struct vfio_dma *dma,
+		       dma_addr_t iova_start, long nr_pages)
+{
+	dma_addr_t iova_end = iova_start + (nr_pages << PAGE_SHIFT);
+	struct vfio_pfn *top = vfio_find_vpfn_range(dma, iova_start, iova_end);
+	long ret = 1;
+	struct vfio_pfn *vpfn;
+	struct rb_node *prev;
+	struct rb_node *next;
+
+	if (likely(!top))
+		return 0;
+
+	prev = next = &top->node;
+
+	while ((prev = rb_prev(prev))) {
+		vpfn = rb_entry(prev, struct vfio_pfn, node);
+		if (vpfn->iova < iova_start)
+			break;
+		ret++;
+	}
+
+	while ((next = rb_next(next))) {
+		vpfn = rb_entry(next, struct vfio_pfn, node);
+		if (vpfn->iova >= iova_end)
+			break;
+		ret++;
+	}
+
+	return ret;
+}
+
 /*
  * Attempt to pin pages. We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
@@ -680,32 +741,40 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 		 * and rsvd here, and therefore continues to use the batch.
 		 */
 		while (true) {
+			long nr_pages, acct_pages = 0;
+
 			if (pfn != *pfn_base + pinned ||
 			    rsvd != is_invalid_reserved_pfn(pfn))
 				goto out;
 
+			nr_pages = contig_pages(dma, batch, iova);
+			if (!rsvd) {
+				acct_pages = nr_pages;
+				acct_pages -= vpfn_pages(dma, iova, nr_pages);
+			}
+
 			/*
 			 * Reserved pages aren't counted against the user,
 			 * externally pinned pages are already counted against
 			 * the user.
 			 */
-			if (!rsvd && !vfio_find_vpfn(dma, iova)) {
+			if (acct_pages) {
 				if (!dma->lock_cap &&
-				    mm->locked_vm + lock_acct + 1 > limit) {
+				    mm->locked_vm + lock_acct + acct_pages > limit) {
 					pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
 						__func__, limit << PAGE_SHIFT);
 					ret = -ENOMEM;
 					goto unpin_out;
 				}
-				lock_acct++;
+				lock_acct += acct_pages;
 			}
 
-			pinned++;
-			npage--;
-			vaddr += PAGE_SIZE;
-			iova += PAGE_SIZE;
-			batch->offset++;
-			batch->size--;
+			pinned += nr_pages;
+			npage -= nr_pages;
+			vaddr += PAGE_SIZE * nr_pages;
+			iova += PAGE_SIZE * nr_pages;
+			batch->offset += nr_pages;
+			batch->size -= nr_pages;
 
 			if (!batch->size)
 				break;
-- 
2.20.1

From nobody Wed Oct 8 10:02:58 2025
From: lizhe.67@bytedance.com
To: alex.williamson@redhat.com, jgg@ziepe.ca, david@redhat.com, peterx@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, lizhe.67@bytedance.com
Subject: [PATCH 2/4] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
Date: Mon, 30 Jun 2025 15:25:16 +0800
Message-ID: <20250630072518.31846-3-lizhe.67@bytedance.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250630072518.31846-1-lizhe.67@bytedance.com>
References: <20250630072518.31846-1-lizhe.67@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Li Zhe

The function vpfn_pages() determines the number of vpfn nodes in the
vpfn rb tree within a specified range. This allows us to avoid searching
for each vpfn individually in vfio_unpin_pages_remote().

This patch batches the vfio_find_vpfn() calls in
vfio_unpin_pages_remote().

Signed-off-by: Li Zhe
---
 drivers/vfio/vfio_iommu_type1.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a2d7abd4f2c2..330fff4fe96d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -804,16 +804,12 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
-	long unlocked = 0, locked = 0;
+	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
 	long i;
 
-	for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
-		if (put_pfn(pfn++, dma->prot)) {
+	for (i = 0; i < npage; i++)
+		if (put_pfn(pfn++, dma->prot))
 			unlocked++;
-			if (vfio_find_vpfn(dma, iova))
-				locked++;
-		}
-	}
 
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
-- 
2.20.1

From nobody Wed Oct 8 10:02:58 2025
From: lizhe.67@bytedance.com
To: alex.williamson@redhat.com, jgg@ziepe.ca, david@redhat.com, peterx@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, lizhe.67@bytedance.com
Subject: [PATCH 3/4] vfio/type1: introduce a new member has_rsvd for struct vfio_dma
Date: Mon, 30 Jun 2025 15:25:17 +0800
Message-ID: <20250630072518.31846-4-lizhe.67@bytedance.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250630072518.31846-1-lizhe.67@bytedance.com>
References: <20250630072518.31846-1-lizhe.67@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Li Zhe

Introduce a new member has_rsvd for struct vfio_dma. This member
indicates whether there are any reserved or invalid pfns in the region
represented by this vfio_dma. If it is true, at least one pfn in this
region is either reserved or invalid.

Signed-off-by: Li Zhe
---
 drivers/vfio/vfio_iommu_type1.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 330fff4fe96d..a02bc340c112 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -92,6 +92,7 @@ struct vfio_dma {
 	bool			iommu_mapped;
 	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
 	bool			vaddr_invalid;
+	bool			has_rsvd;	/* has 1 or more rsvd pfns */
 	struct task_struct	*task;
 	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
 	unsigned long		*bitmap;
@@ -784,6 +785,7 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	}
 
 out:
+	dma->has_rsvd |= rsvd;
 	ret = vfio_lock_acct(dma, lock_acct, false);
 
 unpin_out:
-- 
2.20.1

From nobody Wed Oct 8 10:02:58 2025
From: lizhe.67@bytedance.com
To: alex.williamson@redhat.com, jgg@ziepe.ca, david@redhat.com, peterx@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, lizhe.67@bytedance.com
Subject: [PATCH 4/4] vfio/type1: optimize vfio_unpin_pages_remote() for large folio
Date: Mon, 30 Jun 2025 15:25:18 +0800
Message-ID: <20250630072518.31846-5-lizhe.67@bytedance.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250630072518.31846-1-lizhe.67@bytedance.com>
References: <20250630072518.31846-1-lizhe.67@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Li Zhe

When vfio_unpin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs an individual
put_pfn() operation for each page. This can lead to significant
performance overhead, especially when dealing with large ranges of
pages.

It would be very rare for reserved and non-reserved PFNs to be mixed
within the same range. So this patch uses the has_rsvd variable
introduced in the previous patch to determine whether batched put_pfn()
operations can be performed. Moreover, compared to put_pfn(),
unpin_user_page_range_dirty_lock() handles large folio scenarios more
efficiently.

The performance test results for completing the 16G VFIO IOMMU DMA
unmapping are as follows.

Base(v6.16-rc4):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.135 s (118.6 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.136 s (117.3 GB/s)

With this patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.045 s (357.6 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.288 s (55.6 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.045 s (352.9 GB/s)

For large folios, the VFIO UNMAP DMA time drops by over 66%. For small
folios, the test results show no significant change.

Suggested-by: Jason Gunthorpe
Signed-off-by: Li Zhe
---
 drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a02bc340c112..7cacfb2cefe3 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -802,17 +802,29 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	return pinned;
 }
 
+static inline void put_valid_unreserved_pfns(unsigned long start_pfn,
+		unsigned long npage, int prot)
+{
+	unpin_user_page_range_dirty_lock(pfn_to_page(start_pfn), npage,
+					 prot & IOMMU_WRITE);
+}
+
 static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
 	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
-	long i;
 
-	for (i = 0; i < npage; i++)
-		if (put_pfn(pfn++, dma->prot))
-			unlocked++;
+	if (dma->has_rsvd) {
+		long i;
 
+		for (i = 0; i < npage; i++)
+			if (put_pfn(pfn++, dma->prot))
+				unlocked++;
+	} else {
+		put_valid_unreserved_pfns(pfn, npage, dma->prot);
+		unlocked = npage;
+	}
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
 
-- 
2.20.1
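
The batched accounting used in this series (count the externally pinned
pages inside a contiguous run once, then charge the remainder in a single
step) can be illustrated outside the kernel. The following is only a
userspace sketch under stated assumptions, not the driver code: a sorted
array stands in for the vpfn rb tree, and struct pinned_set,
lower_bound(), pinned_in_range(), pages_to_account() and
SKETCH_PAGE_SHIFT are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical stand-in for the vpfn rb tree: a sorted array of
 * page-aligned iovas that are already pinned and accounted externally.
 */
struct pinned_set {
	const uint64_t *iova;	/* sorted in ascending order */
	size_t count;
};

/* Index of the first element that is >= key (classic lower bound). */
static size_t lower_bound(const struct pinned_set *set, uint64_t key)
{
	size_t lo = 0, hi = set->count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (set->iova[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Number of externally pinned pages inside [iova_start, iova_end). */
static long pinned_in_range(const struct pinned_set *set,
			    uint64_t iova_start, long nr_pages)
{
	uint64_t iova_end;

	assert(nr_pages >= 0);
	iova_end = iova_start + ((uint64_t)nr_pages << SKETCH_PAGE_SHIFT);

	return (long)(lower_bound(set, iova_end) -
		      lower_bound(set, iova_start));
}

/* Pages of one contiguous run that still have to be charged to the user. */
static long pages_to_account(const struct pinned_set *set,
			     uint64_t iova_start, long nr_pages, int rsvd)
{
	if (rsvd)
		return 0;	/* reserved pages are never charged */

	return nr_pages - pinned_in_range(set, iova_start, nr_pages);
}
```

With externally pinned iovas 0x1000, 0x3000 and 0x8000, a 16-page run
starting at iova 0 would charge 13 pages in one step instead of running
16 separate lookups. The sketch uses two binary searches where the patch
walks rb_prev()/rb_next() from the first overlapping node; the counting
idea is the same, the data structure is not.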