[PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices

Logan Gunthorpe posted 9 patches 3 years, 5 months ago
[PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices
Posted by Logan Gunthorpe 3 years, 5 months ago
Hi,

This is the latest P2PDMA userspace patch set. This version includes
some cleanup from feedback from the last posting[1].

This patch set enables userspace P2PDMA by allowing userspace to mmap()
allocated chunks of the CMB. The resulting VMA can be passed only
to O_DIRECT IO on NVMe backed files or block devices. A flag is added
to GUP() in Patch 2 (Patch 1 is a preparatory cleanup of try_grab_page()),
then Patches 3 through 7 wire this flag up based on whether the block
queue indicates P2PDMA support. Patch 8 creates the sysfs resource that
can hand out the VMAs and Patch 9 adds brief documentation for the new
interface.
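
For illustration, the userspace flow described above might look roughly
like the sketch below. It is not taken from the series. It assumes the
sysfs file is named p2pmem/allocate under the PCI device directory
(inferred from the title of the documentation patch) and that the
mapping is requested with MAP_SHARED. The helper names p2pmem_map() and
p2pmem_read() are hypothetical.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Build the path of the sysfs allocation file for a PCI device.
 * ASSUMPTION: the file is <device>/p2pmem/allocate.
 * Returns 0 on success, -1 if the path does not fit in out.
 */
int p2pmem_allocate_path(const char *bdf, char *out, size_t len)
{
	assert(bdf != NULL && out != NULL);
	int n = snprintf(out, len,
			 "/sys/bus/pci/devices/%s/p2pmem/allocate", bdf);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: mmap() len bytes of P2P memory from the device
 * identified by bdf (e.g. "0000:01:00.0"). Returns MAP_FAILED on error.
 */
void *p2pmem_map(const char *bdf, size_t len)
{
	char path[256];

	if (p2pmem_allocate_path(bdf, path, sizeof(path)) != 0)
		return MAP_FAILED;

	int fd = open(path, O_RDWR);
	if (fd < 0)
		return MAP_FAILED;

	/* ASSUMPTION: a shared, read/write mapping at offset 0. */
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);             /* an established mapping outlives the fd */
	return buf;
}

/*
 * Hypothetical helper: read len bytes at offset off from an NVMe-backed
 * file or block device straight into the P2P buffer. O_DIRECT is
 * required; len and off must respect the device's alignment rules.
 */
ssize_t p2pmem_read(const char *nvme_path, void *buf, size_t len, off_t off)
{
	int fd = open(nvme_path, O_RDONLY | O_DIRECT);
	if (fd < 0)
		return -1;

	ssize_t ret = pread(fd, buf, len, off);
	close(fd);
	return ret;
}
```

A caller would map a buffer with p2pmem_map(), pass it to p2pmem_read()
with the path of an NVMe block device or a file on one, and munmap()
the buffer when done. Buffered (non-O_DIRECT) IO on such a mapping is
expected to fail, since the VMA can be passed only to O_DIRECT IO.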

Feedback welcome.

This series is based on v6.1-rc1. A git branch is available here:

  https://github.com/sbates130272/linux-p2pmem/  p2pdma_user_cmb_v11

Thanks,

Logan

[1] https://lkml.kernel.org/r/20220922163926.7077-1-logang@deltatee.com

--

Changes in v11:
  - Rebased onto v6.1-rc1, fixed minor conflict in bio_map_user_iov
  - The GUP test was moved to try_grab_page() and try_grab_folio().
    This ought to be a bit more future proof. It required adding a new
    cleanup patch to return a proper error code from try_grab_page().
    (Per Jason)
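
    As a rough illustration of that check, here is a small userspace
    model, not the kernel code. The struct, the flag values and the
    function name try_grab_page_model() are hypothetical stand-ins, and
    the choice of -EREMOTEIO as the error code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; they are not the kernel's. */
#define FOLL_GET        (1u << 0)
#define FOLL_PCI_P2PDMA (1u << 1)

/* Hypothetical stand-in for struct page. */
struct page_model {
	bool is_pci_p2pdma;   /* models is_pci_p2pdma_page(page) */
	int refcount;
};

/*
 * Model of a try_grab_page() that returns an error code instead of a
 * bool: 0 on success, negative errno on failure. P2PDMA pages are
 * refused unless the caller opted in with FOLL_PCI_P2PDMA.
 */
int try_grab_page_model(struct page_model *page, unsigned int gup_flags)
{
	assert(page != NULL);

	/* ASSUMPTION: -EREMOTEIO is the error used for refused pages. */
	if (page->is_pci_p2pdma && !(gup_flags & FOLL_PCI_P2PDMA))
		return -EREMOTEIO;

	if (gup_flags & FOLL_GET)
		page->refcount++;

	return 0;
}
```

    Placing the test at the point where the reference is taken means
    every GUP path that grabs a page goes through it, which is the
    sense in which it should be more future proof.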

Changes in v10:
  - Rebased onto v6.0-rc6
  - Reworked iov iter changes to reuse the code better and
    name them without the _flags() suffix (per Christoph)
  - Renamed a number of flags variables to gup_flags (per John)
  - Minor fixups to the last documentation patch (from Greg and John)

Changes in v9:
  - Rebased onto v6.0-rc2, included reworking the iov_iter patch
    due to changes there
  - Drop the char device mmap implementation in favour of a sysfs
    based interface. (per Christoph)

 (v8 only included the first half of the series and was merged for v6.0)

Changes in v8:
  - Rebase onto v5.19-rc1
  - Rework how the pages are stored in the VMA per Jason's suggestion

Changes in v7:
  - Rebased onto v5.18-rc1 which includes Christoph's cleanup to
    free_zone_device_page() (similar to Ralph's patch).
  - Fix bug with concurrent first calls to pci_p2pdma_vma_fault()
    that caused a double allocation and lost p2p memory. Noticed
    by Andrew Maier.
  - Collected a Reviewed-by tag from Chaitanya.
  - Numerous minor fixes to commit messages

--

Logan Gunthorpe (9):
  mm: allow multiple error returns in try_grab_page()
  mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages
  iov_iter: introduce iov_iter_get_pages_[alloc_]flags()
  block: add check when merging zone device pages
  lib/scatterlist: add check when merging zone device pages
  block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages()
  block: set FOLL_PCI_P2PDMA in bio_map_user_iov()
  PCI/P2PDMA: Allow userspace VMA allocations through sysfs
  ABI: sysfs-bus-pci: add documentation for p2pmem allocate

 Documentation/ABI/testing/sysfs-bus-pci |  10 ++
 block/bio.c                             |  11 ++-
 block/blk-map.c                         |  12 ++-
 drivers/pci/p2pdma.c                    | 124 ++++++++++++++++++++++++
 include/linux/mm.h                      |   3 +-
 include/linux/mmzone.h                  |  24 +++++
 include/linux/uio.h                     |   6 ++
 lib/iov_iter.c                          |  32 ++++--
 lib/scatterlist.c                       |  25 +++--
 mm/gup.c                                |  45 ++++++---
 mm/huge_memory.c                        |  19 ++--
 mm/hugetlb.c                            |  23 +++--
 12 files changed, 280 insertions(+), 54 deletions(-)


base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
--
2.30.2
Re: [PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices
Posted by Christoph Hellwig 3 years, 5 months ago
The series looks good to me now. How do we want to handle it?  I think
we need a special branch somewhere (maybe in the block or mm trees?)
so that we can base the other iov_iter work from John on it.  Also
Al has a whole bunch of iov_iter changes that we probably want on
the same branch as well, although some of those (READ vs WRITE fixups)
look like 6.1 material to me.
Re: [PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices
Posted by John Hubbard 3 years, 5 months ago
On 10/24/22 08:03, Christoph Hellwig wrote:
> The series looks good to me now. How do we want to handle it?  I think
> we need a special branch somewhere (maybe in the block or mm trees?)
> so that we can base the other iov_iter work from John on it.  Also
> Al has a whole bunch of iov_iter changes that we probably want on
> the same branch as well, although some of those (READ vs WRITE fixups)
> look like 6.1 material to me.
> 

A little earlier, Jens graciously offered [1] to provide a topic branch,
such as:

     for-6.2/block-gup [2]

(I've moved the name forward from 6.1 to 6.2, because that discussion
was 7 weeks ago.)


[1] https://lore.kernel.org/ae675a01-90e6-4af1-6c43-660b3a6c7b72@kernel.dk
[2] https://lore.kernel.org/55a2d67f-9a12-9fe6-d73b-8c3f5eb36f31@kernel.dk

thanks,
-- 
John Hubbard
NVIDIA
Re: [PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices
Posted by Christoph Hellwig 3 years, 4 months ago
On Mon, Oct 24, 2022 at 12:15:56PM -0700, John Hubbard wrote:
> A little earlier, Jens graciously offered [1] to provide a topic branch,
> such as:
>
>     for-6.2/block-gup [2]
>
> (I've moved the name forward from 6.1 to 6.2, because that discussion
> was 7 weeks ago.)

So what are we going to do with this series?  It would be sad to miss
the merge window again.
Re: [PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices
Posted by Logan Gunthorpe 3 years, 4 months ago
@add Jens

On 2022-11-07 23:56, Christoph Hellwig wrote:
> On Mon, Oct 24, 2022 at 12:15:56PM -0700, John Hubbard wrote:
>> A little earlier, Jens graciously offered [1] to provide a topic branch,
>> such as:
>>
>>     for-6.2/block-gup [2]
>>
>> (I've moved the name forward from 6.1 to 6.2, because that discussion
>> was 7 weeks ago.)
> 
> So what are we going to do with this series?  It would be sad to miss
> the merge window again.

I noticed Jens wasn't copied on this series. I've added him. It would be
nice to get this in someone's tree soon.

Thanks!

Logan
Re: [PATCH v11 0/9] Userspace P2PDMA with O_DIRECT NVMe devices
Posted by Jens Axboe 3 years, 4 months ago
On 11/9/22 10:28 AM, Logan Gunthorpe wrote:
> @add Jens
> 
> On 2022-11-07 23:56, Christoph Hellwig wrote:
>> On Mon, Oct 24, 2022 at 12:15:56PM -0700, John Hubbard wrote:
>>> A little earlier, Jens graciously offered [1] to provide a topic branch,
>>> such as:
>>>
>>>     for-6.2/block-gup [2]
>>>
>>> (I've moved the name forward from 6.1 to 6.2, because that discussion
>>> was 7 weeks ago.)
>>
>> So what are we going to do with this series?  It would be sad to miss
>> the merge window again.
> 
> I noticed Jens wasn't copied on this series. I've added him. It would be
> nice to get this in someone's tree soon.

I took a look and the series looks fine to me.

-- 
Jens Axboe