Hi Jason, Leon,

I was able to fix the issue on my side; things work fine now. I have two questions though:

1) The values returned from the dma_link_range() function are not contiguous, see the print below. The "linked pa" is the function's return value. I think the dma_map_sgtable() API would return contiguous DMA addresses. Is the dma_map_sgtable() API more efficient with respect to the IOMMU page table, i.e. does it try to use a bigger page size, such as 2M, when possible? Does your new API have the same consideration? I vaguely remember Jason mentioning something like that, but my print below doesn't look that way. Maybe I need to test a bigger range (the print below covers only a 16-page range). Comments?

[17584.665126] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 18ef3f000
[17584.665146] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 190d00000
[17584.665150] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 190024000
[17584.665153] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 178e89000

2) The comment for the dma_link_range() function says: "@dma_offset needs to be advanced by the caller with the size of previous page that was linked + DMA address returned for the previous page". Is this description correct? I don't understand the part "+ DMA address returned for the previous page". In my code, when I call this function to link, say, 10 pages, the first dma_offset is 0, the second is 4K, the third 8K, and so on. This worked for me; I did not add the previously returned DMA address. Maybe I need more testing, but any comment? (A minimal sketch of my calling pattern is at the end of this mail.)

Thanks,
Oak

> -----Original Message----- > From: Jason Gunthorpe <jgg@ziepe.ca> > Sent: Monday, June 10, 2024 1:25 PM > To: Zeng, Oak <oak.zeng@intel.com> > Cc: Leon Romanovsky <leon@kernel.org>; Christoph Hellwig <hch@lst.de>; > Robin Murphy <robin.murphy@arm.com>; Marek Szyprowski > <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will > Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; > Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas > <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; Jens > Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi > Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, Kevin > <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; > Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- > foundation.org>; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; > linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; > iommu@lists.linux.dev; linux-nvme@lists.infradead.org; > kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche > <bvanassche@acm.org>; Damien Le Moal > <damien.lemoal@opensource.wdc.com>; Amir Goldstein > <amir73il@gmail.com>; josef@toxicpanda.com; Martin K. Petersen > <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J > <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun > <zyjzyj2000@gmail.com>; Bommu, Krishnaiah > <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad > <himal.prasad.ghimiray@intel.com> > Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to > two steps > > On Mon, Jun 10, 2024 at 04:40:19PM +0000, Zeng, Oak wrote: > > Thanks Leon and Yanjun for the reply! > > > > Based on the reply, we will continue use the current version for > > test (as it is tested for vfio and rdma). We will switch to v1 once > > it is fully tested/reviewed.
> > I'm glad you are finding it useful, one of my interests with this work > is to improve all the HMM users. > > Jason
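Here is the minimal sketch I mentioned above, just to show the calling pattern for question 2. The dma_link_range()/dma_alloc_iova() names and prototypes below are only my assumption of the RFC's interface, written from memory for illustration, so they may not match the actual patches; the only point is how dma_offset advances.

#include <linux/dma-mapping.h>
#include <linux/mm.h>

struct dma_iova_attrs;	/* opaque here; defined by the RFC patches */

/*
 * ASSUMED prototype, for illustration only -- the real signature in the
 * RFC may differ.  Links one page at @dma_offset inside an IOVA range
 * that was set up earlier (e.g. with dma_alloc_iova()).
 */
dma_addr_t dma_link_range(struct page *page, unsigned long offset,
			  struct dma_iova_attrs *iova, dma_addr_t dma_offset);

static int link_hmm_pages(struct device *dev, struct dma_iova_attrs *iova,
			  struct page **pages, unsigned long npages,
			  dma_addr_t *dma_addrs)
{
	dma_addr_t dma_offset = 0;
	unsigned long i;

	for (i = 0; i < npages; i++) {
		dma_addrs[i] = dma_link_range(pages[i], 0, iova, dma_offset);
		if (dma_mapping_error(dev, dma_addrs[i]))
			return -EFAULT;

		/* Advance by the size of the page just linked: 0, 4K, 8K, ... */
		dma_offset += PAGE_SIZE;
	}
	return 0;
}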
On Mon, Jun 10, 2024 at 09:28:04PM +0000, Zeng, Oak wrote: > Hi Jason, Leon, > > I was able to fix the issue from my side. Things work fine now. I got two questions though: > > 1) The value returned from dma_link_range function is not contiguous, see below print. The "linked pa" is the function return. > I think dma_map_sgtable API would return some contiguous dma address. Is the dma-map_sgtable api is more efficient regarding the iommu page table? i.e., try to use bigger page size, such as use 2M page size when it is possible. With your new API, does it also have such consideration? I vaguely remembered Jason mentioned such thing, but my print below doesn't look like so. Maybe I need to test bigger range (only 16 pages range in the test of below printing). Comment? My API gives you the flexibility to use any page size you want. You can use 2M pages instead of 4K pages. The API doesn't enforce any page size. > > [17584.665126] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 18ef3f000 > [17584.665146] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 190d00000 > [17584.665150] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 190024000 > [17584.665153] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 178e89000 > > 2) in the comment of dma_link_range function, it is said: " @dma_offset needs to be advanced by the caller with the size of previous page that was linked + DMA address returned for the previous page". > Is this description correct? I don't understand the part "+ DMA address returned for the previous page ". > In my codes, let's say I call this function to link 10 pages, the first dma_offset is 0, second is 4k, third 8k. This worked for me. I didn't add the previously returned dma address. > Maybe I need more test. But any comment? You did it perfectly right. This is the correct way to advance dma_offset. Thanks > > Thanks, > Oak > > > -----Original Message----- > > From: Jason Gunthorpe <jgg@ziepe.ca> > > Sent: Monday, June 10, 2024 1:25 PM > > To: Zeng, Oak <oak.zeng@intel.com> > > Cc: Leon Romanovsky <leon@kernel.org>; Christoph Hellwig <hch@lst.de>; > > Robin Murphy <robin.murphy@arm.com>; Marek Szyprowski > > <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will > > Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; > > Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas > > <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; Jens > > Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi > > Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; > > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, Kevin > > <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; > > Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- > > foundation.org>; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; > > linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; > > iommu@lists.linux.dev; linux-nvme@lists.infradead.org; > > kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche > > <bvanassche@acm.org>; Damien Le Moal > > <damien.lemoal@opensource.wdc.com>; Amir Goldstein > > <amir73il@gmail.com>; josef@toxicpanda.com; Martin K. 
Petersen > > <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J > > <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun > > <zyjzyj2000@gmail.com>; Bommu, Krishnaiah > > <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad > > <himal.prasad.ghimiray@intel.com> > > Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to > > two steps > > > > On Mon, Jun 10, 2024 at 04:40:19PM +0000, Zeng, Oak wrote: > > > Thanks Leon and Yanjun for the reply! > > > > > > Based on the reply, we will continue use the current version for > > > test (as it is tested for vfio and rdma). We will switch to v1 once > > > it is fully tested/reviewed. > > > > I'm glad you are finding it useful, one of my interests with this work > > is to improve all the HMM users. > > > > Jason
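To make the "any page size" point above concrete, here is a hedged sketch. dma_link_chunk() below is a hypothetical stand-in for whatever "link one physically contiguous chunk" primitive the final API exposes (the real dma_link_range() prototype may differ); the point is only that dma_offset advances by whatever chunk size you choose to link, e.g. 2M instead of 4K.

#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/pfn.h>
#include <linux/sizes.h>

struct dma_iova_attrs;	/* opaque here; defined by the RFC patches */

/* HYPOTHETICAL stand-in, not the RFC's real prototype. */
dma_addr_t dma_link_chunk(struct dma_iova_attrs *iova, phys_addr_t phys,
			  dma_addr_t dma_offset, size_t size);

/*
 * When the CPU memory behind the range is one 2M huge page, a single 2M
 * link (and a single 2M step of dma_offset) replaces 512 4K links; a
 * 2M-sized, 2M-aligned chunk also lets the IOMMU driver use a 2M
 * page-table entry.
 */
static dma_addr_t link_one_2m_chunk(struct dma_iova_attrs *iova,
				    struct page *head, dma_addr_t *dma_offset)
{
	dma_addr_t addr = dma_link_chunk(iova, PFN_PHYS(page_to_pfn(head)),
					 *dma_offset, SZ_2M);

	*dma_offset += SZ_2M;	/* advance by the chunk just linked */
	return addr;
}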
Thank you Leon. That is helpful.

I also have another very naïve question: I don't understand what the IOVA address is. I previously thought the IOVA address space is the same as the DMA address space when an IOMMU is involved; i.e. dma_alloc_iova() would allocate a contiguous IOVA address range, and later the dma_link_range() function would link a physical page to an IOVA address and return that IOVA address. In other words, I thought the DMA address is the IOVA address, and the IOMMU page table translates a DMA address (or IOVA address) to a physical address.

But from my print below, that understanding is obviously wrong: iova.dma_addr is 0 and the DMA address returned from dma_link_range() is non-zero... Can you help me understand what the IOVA address is? Is the IOVA address IOMMU related? Since dma_link_range() returns a non-IOVA address, does this function allocate the DMA address itself? Is the DMA address correlated with the IOVA address?

Oak

> -----Original Message----- > From: Leon Romanovsky <leon@kernel.org> > Sent: Tuesday, June 11, 2024 11:45 AM > To: Zeng, Oak <oak.zeng@intel.com> > Cc: Jason Gunthorpe <jgg@ziepe.ca>; Christoph Hellwig <hch@lst.de>; Robin > Murphy <robin.murphy@arm.com>; Marek Szyprowski > <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will > Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; > Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas > <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; Jens > Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi > Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, Kevin > <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; > Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- > foundation.org>; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; > linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; > iommu@lists.linux.dev; linux-nvme@lists.infradead.org; > kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche > <bvanassche@acm.org>; Damien Le Moal > <damien.lemoal@opensource.wdc.com>; Amir Goldstein > <amir73il@gmail.com>; josef@toxicpanda.com; Martin K. Petersen > <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J > <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun > <zyjzyj2000@gmail.com>; Bommu, Krishnaiah > <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad > <himal.prasad.ghimiray@intel.com> > Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to > two steps > > On Mon, Jun 10, 2024 at 09:28:04PM +0000, Zeng, Oak wrote: > > Hi Jason, Leon, > > > > I was able to fix the issue from my side. Things work fine now. I got two > questions though: > > > > 1) The value returned from dma_link_range function is not contiguous, see > below print. The "linked pa" is the function return. > > I think dma_map_sgtable API would return some contiguous dma address. > Is the dma-map_sgtable api is more efficient regarding the iommu page table? > i.e., try to use bigger page size, such as use 2M page size when it is possible. > With your new API, does it also have such consideration? I vaguely > remembered Jason mentioned such thing, but my print below doesn't look > like so. Maybe I need to test bigger range (only 16 pages range in the test of > below printing). Comment? > > My API gives you the flexibility to use any page size you want. You can > use 2M pages instead of 4K pages. The API doesn't enforce any page size.
> > > > > [17584.665126] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > linked pa = 18ef3f000 > > [17584.665146] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > linked pa = 190d00000 > > [17584.665150] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > linked pa = 190024000 > > [17584.665153] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > linked pa = 178e89000 > > > > 2) in the comment of dma_link_range function, it is said: " @dma_offset > needs to be advanced by the caller with the size of previous page that was > linked + DMA address returned for the previous page". > > Is this description correct? I don't understand the part "+ DMA address > returned for the previous page ". > > In my codes, let's say I call this function to link 10 pages, the first > dma_offset is 0, second is 4k, third 8k. This worked for me. I didn't add the > previously returned dma address. > > Maybe I need more test. But any comment? > > You did it perfectly right. This is the correct way to advance dma_offset. > > Thanks > > > > > Thanks, > > Oak > > > > > -----Original Message----- > > > From: Jason Gunthorpe <jgg@ziepe.ca> > > > Sent: Monday, June 10, 2024 1:25 PM > > > To: Zeng, Oak <oak.zeng@intel.com> > > > Cc: Leon Romanovsky <leon@kernel.org>; Christoph Hellwig > <hch@lst.de>; > > > Robin Murphy <robin.murphy@arm.com>; Marek Szyprowski > > > <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will > > > Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; > > > Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas > > > <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; > Jens > > > Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi > > > Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; > > > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, > Kevin > > > <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; > > > Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- > > > foundation.org>; linux-doc@vger.kernel.org; linux- > kernel@vger.kernel.org; > > > linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; > > > iommu@lists.linux.dev; linux-nvme@lists.infradead.org; > > > kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche > > > <bvanassche@acm.org>; Damien Le Moal > > > <damien.lemoal@opensource.wdc.com>; Amir Goldstein > > > <amir73il@gmail.com>; josef@toxicpanda.com; Martin K. Petersen > > > <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J > > > <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun > > > <zyjzyj2000@gmail.com>; Bommu, Krishnaiah > > > <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad > > > <himal.prasad.ghimiray@intel.com> > > > Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to > > > two steps > > > > > > On Mon, Jun 10, 2024 at 04:40:19PM +0000, Zeng, Oak wrote: > > > > Thanks Leon and Yanjun for the reply! > > > > > > > > Based on the reply, we will continue use the current version for > > > > test (as it is tested for vfio and rdma). We will switch to v1 once > > > > it is fully tested/reviewed. > > > > > > I'm glad you are finding it useful, one of my interests with this work > > > is to improve all the HMM users. > > > > > > Jason
On Tue, Jun 11, 2024 at 06:26:23PM +0000, Zeng, Oak wrote:
> Thank you Leon. That is helpful.
>
> I also have another very naïve question. I don't understand what is the iova address. I previously thought the iova address space is the same as the dma_address space when iommu is involved. I thought the dma_alloc_iova would allocate some contiguous iova address range and later dma_link_range function would link a physical page to the iova address and return the iova address. In other words, I thought the dma_address is iova address, and the iommu page table translate a dma_address or iova address to the physical address.

This is the right understanding.

> But from my print below, my above understanding is obviously wrong: the iova.dma_addr is 0 and the dma_address returned from dma_link_range is none zero... Can you help me what is iova address? Is iova address iommu related? Since dma_link_range returns a non-iova address, does this function allocate the dma-address itself? Is dma-address correlated with iova address?

This is a combination of two things:

1. The need to support HMM-specific logic.
2. The outcome of the v0 version, where I implemented dma_link_range() to fall back to DMA direct mode. See patches 2 and 3:
https://lore.kernel.org/all/54a3554639bfb963c9919c5d7c1f449021bebdb3.1709635535.git.leon@kernel.org/
https://lore.kernel.org/all/f1049f0fc280288ae2f0c1e02388cde91b0f7876.1709635535.git.leon@kernel.org/

So dma-iova == 0 means that you are working in direct mode and not with the IOMMU, i.e. you can translate from a physical address to a DMA address with a simple call to phys_to_dma() (see the small sketch at the end of this mail). One of the review comments was that this is not the desired behaviour, and that I need to create separate functions that are used only when the IOMMU is in use.

See the difference between v0 and v1 of the dma_link_range() function:
v0: https://lore.kernel.org/all/f1049f0fc280288ae2f0c1e02388cde91b0f7876.1709635535.git.leon@kernel.org/
v1: https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=dma-split-v1&id=5aa29f2620ef86ac58c17a0297929a0b9e8d7790

And here is the HMM variant of the dma_link_range() function, which saves you from having to copy/paste the same HMM logic from RDMA into DRM:
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=dma-split-v1&id=4d8d8d4fbe7891b1412f03ecaff88bc492e2e4eb Thanks > > Oak > > > -----Original Message----- > > From: Leon Romanovsky <leon@kernel.org> > > Sent: Tuesday, June 11, 2024 11:45 AM > > To: Zeng, Oak <oak.zeng@intel.com> > > Cc: Jason Gunthorpe <jgg@ziepe.ca>; Christoph Hellwig <hch@lst.de>; Robin > > Murphy <robin.murphy@arm.com>; Marek Szyprowski > > <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will > > Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; > > Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas > > <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; Jens > > Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi > > Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; > > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, Kevin > > <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; > > Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- > > foundation.org>; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; > > linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; > > iommu@lists.linux.dev; linux-nvme@lists.infradead.org; > > kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche > > <bvanassche@acm.org>; Damien Le Moal > > <damien.lemoal@opensource.wdc.com>; Amir Goldstein > > <amir73il@gmail.com>; josef@toxicpanda.com; Martin K. Petersen > > <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J > > <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun > > <zyjzyj2000@gmail.com>; Bommu, Krishnaiah > > <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad > > <himal.prasad.ghimiray@intel.com> > > Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to > > two steps > > > > On Mon, Jun 10, 2024 at 09:28:04PM +0000, Zeng, Oak wrote: > > > Hi Jason, Leon, > > > > > > I was able to fix the issue from my side. Things work fine now. I got two > > questions though: > > > > > > 1) The value returned from dma_link_range function is not contiguous, see > > below print. The "linked pa" is the function return. > > > I think dma_map_sgtable API would return some contiguous dma address. > > Is the dma-map_sgtable api is more efficient regarding the iommu page table? > > i.e., try to use bigger page size, such as use 2M page size when it is possible. > > With your new API, does it also have such consideration? I vaguely > > remembered Jason mentioned such thing, but my print below doesn't look > > like so. Maybe I need to test bigger range (only 16 pages range in the test of > > below printing). Comment? > > > > My API gives you the flexibility to use any page size you want. You can > > use 2M pages instead of 4K pages. The API doesn't enforce any page size. > > > > > > > > [17584.665126] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > > linked pa = 18ef3f000 > > > [17584.665146] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > > linked pa = 190d00000 > > > [17584.665150] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > > linked pa = 190024000 > > > [17584.665153] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, > > linked pa = 178e89000 > > > > > > 2) in the comment of dma_link_range function, it is said: " @dma_offset > > needs to be advanced by the caller with the size of previous page that was > > linked + DMA address returned for the previous page". > > > Is this description correct? 
I don't understand the part "+ DMA address > > returned for the previous page ". > > > In my codes, let's say I call this function to link 10 pages, the first > > dma_offset is 0, second is 4k, third 8k. This worked for me. I didn't add the > > previously returned dma address. > > > Maybe I need more test. But any comment? > > > > You did it perfectly right. This is the correct way to advance dma_offset. > > > > Thanks > > > > > > > > Thanks, > > > Oak > > > > > > > -----Original Message----- > > > > From: Jason Gunthorpe <jgg@ziepe.ca> > > > > Sent: Monday, June 10, 2024 1:25 PM > > > > To: Zeng, Oak <oak.zeng@intel.com> > > > > Cc: Leon Romanovsky <leon@kernel.org>; Christoph Hellwig > > <hch@lst.de>; > > > > Robin Murphy <robin.murphy@arm.com>; Marek Szyprowski > > > > <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will > > > > Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; > > > > Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas > > > > <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; > > Jens > > > > Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi > > > > Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; > > > > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, > > Kevin > > > > <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; > > > > Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- > > > > foundation.org>; linux-doc@vger.kernel.org; linux- > > kernel@vger.kernel.org; > > > > linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; > > > > iommu@lists.linux.dev; linux-nvme@lists.infradead.org; > > > > kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche > > > > <bvanassche@acm.org>; Damien Le Moal > > > > <damien.lemoal@opensource.wdc.com>; Amir Goldstein > > > > <amir73il@gmail.com>; josef@toxicpanda.com; Martin K. Petersen > > > > <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J > > > > <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun > > > > <zyjzyj2000@gmail.com>; Bommu, Krishnaiah > > > > <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad > > > > <himal.prasad.ghimiray@intel.com> > > > > Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to > > > > two steps > > > > > > > > On Mon, Jun 10, 2024 at 04:40:19PM +0000, Zeng, Oak wrote: > > > > > Thanks Leon and Yanjun for the reply! > > > > > > > > > > Based on the reply, we will continue use the current version for > > > > > test (as it is tested for vfio and rdma). We will switch to v1 once > > > > > it is fully tested/reviewed. > > > > > > > > I'm glad you are finding it useful, one of my interests with this work > > > > is to improve all the HMM users. > > > > > > > > Jason >
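Here is the small sketch I referred to above. It is conceptual only, not the real dma_link_range() implementation, and the struct dma_iova_attrs layout below is assumed from the prints in this thread; it just shows why you see iova.dma_addr == 0 together with non-contiguous per-page DMA addresses in direct mode, while the IOMMU/IOVA path gives contiguous addresses by construction.

#include <linux/dma-direct.h>	/* phys_to_dma() */
#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/pfn.h>

/* ASSUMED layout, inferred from the prints in this thread. */
struct dma_iova_attrs {
	struct device *dev;
	dma_addr_t dma_addr;	/* 0 == no IOVA range, i.e. DMA direct mode */
};

/*
 * Conceptual illustration of the two cases:
 *  - direct mode: the returned DMA address is just the page's physical
 *    address run through phys_to_dma(), so consecutive pages are usually
 *    not contiguous (this is what the "linked pa" prints show);
 *  - IOMMU mode: the returned DMA address is iova->dma_addr + dma_offset,
 *    contiguous by construction, and the IOMMU page table maps it to the
 *    page's physical address.
 */
static dma_addr_t example_linked_dma_addr(struct dma_iova_attrs *iova,
					  struct page *page,
					  dma_addr_t dma_offset)
{
	if (!iova->dma_addr)	/* DMA direct mode */
		return phys_to_dma(iova->dev, PFN_PHYS(page_to_pfn(page)));

	return iova->dma_addr + dma_offset;	/* IOMMU/IOVA mode */
}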
On 10.06.24 23:28, Zeng, Oak wrote: > Hi Jason, Leon, > > I was able to fix the issue from my side. Things work fine now.

Can you enlarge the DMA list and then run tests with fio? I am not sure whether the performance will be better or not.

Thanks,
Zhu Yanjun

> I got two questions though: > > 1) The value returned from dma_link_range function is not contiguous, see below print. The "linked pa" is the function return. > I think dma_map_sgtable API would return some contiguous dma address. Is the dma-map_sgtable api is more efficient regarding the iommu page table? i.e., try to use bigger page size, such as use 2M page size when it is possible. With your new API, does it also have such consideration? I vaguely remembered Jason mentioned such thing, but my print below doesn't look like so. Maybe I need to test bigger range (only 16 pages range in the test of below printing). Comment? > > [17584.665126] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 18ef3f000 > [17584.665146] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 190d00000 > [17584.665150] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 190024000 > [17584.665153] drm_svm_hmmptr_map_dma_pages iova.dma_addr = 0x0, linked pa = 178e89000 > > 2) in the comment of dma_link_range function, it is said: " @dma_offset needs to be advanced by the caller with the size of previous page that was linked + DMA address returned for the previous page". > Is this description correct? I don't understand the part "+ DMA address returned for the previous page ". > In my codes, let's say I call this function to link 10 pages, the first dma_offset is 0, second is 4k, third 8k. This worked for me. I didn't add the previously returned dma address. > Maybe I need more test. But any comment? > > Thanks, > Oak > >> -----Original Message----- >> From: Jason Gunthorpe <jgg@ziepe.ca> >> Sent: Monday, June 10, 2024 1:25 PM >> To: Zeng, Oak <oak.zeng@intel.com> >> Cc: Leon Romanovsky <leon@kernel.org>; Christoph Hellwig <hch@lst.de>; >> Robin Murphy <robin.murphy@arm.com>; Marek Szyprowski >> <m.szyprowski@samsung.com>; Joerg Roedel <joro@8bytes.org>; Will >> Deacon <will@kernel.org>; Chaitanya Kulkarni <chaitanyak@nvidia.com>; >> Brost, Matthew <matthew.brost@intel.com>; Hellstrom, Thomas >> <thomas.hellstrom@intel.com>; Jonathan Corbet <corbet@lwn.net>; Jens >> Axboe <axboe@kernel.dk>; Keith Busch <kbusch@kernel.org>; Sagi >> Grimberg <sagi@grimberg.me>; Yishai Hadas <yishaih@nvidia.com>; >> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>; Tian, Kevin >> <kevin.tian@intel.com>; Alex Williamson <alex.williamson@redhat.com>; >> Jérôme Glisse <jglisse@redhat.com>; Andrew Morton <akpm@linux- >> foundation.org>; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >> linux-block@vger.kernel.org; linux-rdma@vger.kernel.org; >> iommu@lists.linux.dev; linux-nvme@lists.infradead.org; >> kvm@vger.kernel.org; linux-mm@kvack.org; Bart Van Assche >> <bvanassche@acm.org>; Damien Le Moal >> <damien.lemoal@opensource.wdc.com>; Amir Goldstein >> <amir73il@gmail.com>; josef@toxicpanda.com; Martin K.
Petersen >> <martin.petersen@oracle.com>; daniel@iogearbox.net; Williams, Dan J >> <dan.j.williams@intel.com>; jack@suse.com; Zhu Yanjun >> <zyjzyj2000@gmail.com>; Bommu, Krishnaiah >> <krishnaiah.bommu@intel.com>; Ghimiray, Himal Prasad >> <himal.prasad.ghimiray@intel.com> >> Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to >> two steps >> >> On Mon, Jun 10, 2024 at 04:40:19PM +0000, Zeng, Oak wrote: >>> Thanks Leon and Yanjun for the reply! >>> >>> Based on the reply, we will continue use the current version for >>> test (as it is tested for vfio and rdma). We will switch to v1 once >>> it is fully tested/reviewed. >> I'm glad you are finding it useful, one of my interests with this work >> is to improve all the HMM users. >> >> Jason -- Best