> From: Nicolin Chen <nicolinc@nvidia.com> > Sent: Friday, April 21, 2023 3:56 PM > > On Fri, Apr 21, 2023 at 07:47:13AM +0000, Tian, Kevin wrote: > > > > It is in the commit message of the cover-letter though: > > > > https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4b > > > f58ae53eb7a8a55 > > > -------------------------------------------------------- > > > Changelog > > > v2: > > > * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call > > > * Renamed the previous set_rid_user to set_dev_data, to decouple from > > > the PCI regime. > > > v1: > > > https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/ > > > -------------------------------------------------------- > > > > > > > Could you add some words why changing from passing the information > > > > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward > > > > the latter option... > > > > > > Yea. Jason told me to decouple it from PCI. And merge it into > > > a general uAPI. So I picked the BIND ioctl. > > > > > > > 'decouple it from PCI' is kind of covered by renaming set_rid > > to set_data. but I didn't get why this has to be merged with another > > uAPI. Once iommufd_device is created we could have separate > > ioctls to poke its attributes individually. What'd be broken if this > > is not done at BIND time? > > Oh, sorry. He didn't literally told me to merge, but commented > "make sense" at my proposal of reusing BIND. So, I don't think > adding to the BIND is a must here. > > The BIND is done in vfio_realize() where the RID (dev_data) is > available also. And the new uAPI in my v1 actually gets called > near the BIND. So, I feel we may just do it once? I am open to > a better idea. > IMHO if this can be done within iommufd then that should be the choice. vfio doesn't need to know this data at all and doing so means vdpa or a 3rd driver also needs to implement similar logic in their uAPI...
On Fri, Apr 21, 2023 at 08:07:19AM +0000, Tian, Kevin wrote:
> External email: Use caution opening links or attachments
>
>
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Friday, April 21, 2023 3:56 PM
> >
> > On Fri, Apr 21, 2023 at 07:47:13AM +0000, Tian, Kevin wrote:
> >
> > > > It is in the commit message of the cover-letter though:
> > > >
> > https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4b
> > > > f58ae53eb7a8a55
> > > > --------------------------------------------------------
> > > > Changelog
> > > > v2:
> > > > * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call
> > > > * Renamed the previous set_rid_user to set_dev_data, to decouple from
> > > > the PCI regime.
> > > > v1:
> > > > https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/
> > > > --------------------------------------------------------
> > > >
> > > > > Could you add some words why changing from passing the information
> > > > > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward
> > > > > the latter option...
> > > >
> > > > Yea. Jason told me to decouple it from PCI. And merge it into
> > > > a general uAPI. So I picked the BIND ioctl.
> > > >
> > >
> > > 'decouple it from PCI' is kind of covered by renaming set_rid
> > > to set_data. but I didn't get why this has to be merged with another
> > > uAPI. Once iommufd_device is created we could have separate
> > > ioctls to poke its attributes individually. What'd be broken if this
> > > is not done at BIND time?
> >
> > Oh, sorry. He didn't literally told me to merge, but commented
> > "make sense" at my proposal of reusing BIND. So, I don't think
> > adding to the BIND is a must here.
> >
> > The BIND is done in vfio_realize() where the RID (dev_data) is
> > available also. And the new uAPI in my v1 actually gets called
> > near the BIND. So, I feel we may just do it once? I am open to
> > a better idea.
> >
>
> IMHO if this can be done within iommufd then that should be
> the choice. vfio doesn't need to know this data at all and doing
> so means vdpa or a 3rd driver also needs to implement similar
> logic in their uAPI...
Reusing the VFIO ioctl is because the device is a VFIO device.
But doing it within iommufd could save us a lot of efforts, as
you said.
So...
+/**
+ * struct iommufd_device_set_data - ioctl(IOMMU_DEVICE_SET_DATA)
+ * @size: sizeof(struct iommufd_device_set_data)
+ * @dev_id: The device to set a device data
+ * @data_uptr: User pointer of the device user data.
+ * @data_len: Length of the device user data.
+ */
+struct iommufd_device_set_data {
+ __u32 size;
+ __u32 dev_id;
+ __aligned_u64 data_uptr;
+ __u32 data_len;
+};
+#define IOMMU_DEVICE_SET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_SET_DATA)
+
+/**
+ * struct iommufd_device_unset_data - ioctl(IOMMU_DEVICE_UNSET_DATA)
+ * @size: sizeof(struct iommufd_device_unset_data)
+ * @dev_id: The device to unset its device data
+ */
+struct iommufd_device_unset_data {
+ __u32 size;
+ __u32 dev_id;
+};
+#define IOMMU_DEVICE_UNSET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_UNSET_DATA)
Maybe just like this?
Thanks
Nic
On Fri, Apr 21, 2023 at 01:20:13AM -0700, Nicolin Chen wrote:
> +/**
> + * struct iommufd_device_set_data - ioctl(IOMMU_DEVICE_SET_DATA)
> + * @size: sizeof(struct iommufd_device_set_data)
> + * @dev_id: The device to set a device data
> + * @data_uptr: User pointer of the device user data.
> + * @data_len: Length of the device user data.
> + */
> +struct iommufd_device_set_data {
> + __u32 size;
> + __u32 dev_id;
> + __aligned_u64 data_uptr;
> + __u32 data_len;
> +};
> +#define IOMMU_DEVICE_SET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_SET_DATA)
> +
> +/**
> + * struct iommufd_device_unset_data - ioctl(IOMMU_DEVICE_UNSET_DATA)
> + * @size: sizeof(struct iommufd_device_unset_data)
> + * @dev_id: The device to unset its device data
> + */
> +struct iommufd_device_unset_data {
> + __u32 size;
> + __u32 dev_id;
> +};
> +#define IOMMU_DEVICE_UNSET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_UNSET_DATA)
>
> Maybe just like this?
How would the iommu_ops backing this work?
Jason
On Fri, Apr 21, 2023 at 10:09:35AM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 21, 2023 at 01:20:13AM -0700, Nicolin Chen wrote:
>
> > +/**
> > + * struct iommufd_device_set_data - ioctl(IOMMU_DEVICE_SET_DATA)
> > + * @size: sizeof(struct iommufd_device_set_data)
> > + * @dev_id: The device to set a device data
> > + * @data_uptr: User pointer of the device user data.
> > + * @data_len: Length of the device user data.
> > + */
> > +struct iommufd_device_set_data {
> > + __u32 size;
> > + __u32 dev_id;
> > + __aligned_u64 data_uptr;
> > + __u32 data_len;
> > +};
> > +#define IOMMU_DEVICE_SET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_SET_DATA)
> > +
> > +/**
> > + * struct iommufd_device_unset_data - ioctl(IOMMU_DEVICE_UNSET_DATA)
> > + * @size: sizeof(struct iommufd_device_unset_data)
> > + * @dev_id: The device to unset its device data
> > + */
> > +struct iommufd_device_unset_data {
> > + __u32 size;
> > + __u32 dev_id;
> > +};
> > +#define IOMMU_DEVICE_UNSET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_UNSET_DATA)
> >
> > Maybe just like this?
>
> How would the iommu_ops backing this work?
How about the following piece? Needs a test with QEMU though..
static const size_t iommufd_device_data_size[] = {
[IOMMU_HW_INFO_TYPE_NONE] = 0,
[IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0,
[IOMMU_HW_INFO_TYPE_ARM_SMMUV3] =
sizeof(struct iommu_device_data_arm_smmuv3),
};
int iommufd_device_set_data(struct iommufd_ucmd *ucmd)
{
struct iommufd_device_set_data *cmd = ucmd->cmd;
struct iommufd_device *idev;
const struct iommu_ops *ops;
void *data = NULL;
u32 klen = 0;
int rc;
if (!cmd->data_uptr || !cmd->data_len)
return -EINVAL;
idev = iommufd_get_device(ucmd, cmd->dev_id);
if (IS_ERR(idev))
return PTR_ERR(idev);
ops = dev_iommu_ops(idev->dev);
if (!ops || !ops->set_dev_data_user || !ops->unset_dev_data_user ||
ops->hw_info_type >= ARRAY_SIZE(iommufd_device_data_size)) {
rc = -EOPNOTSUPP;
goto out_put_idev;
}
klen = iommufd_device_data_size[ops->hw_info_type];
if (!klen) {
rc = -EOPNOTSUPP;
goto out_put_idev;
}
data = kzalloc(klen, GFP_KERNEL);
if (!data) {
rc = -ENOMEM;
goto out_put_idev;
}
if (copy_struct_from_user(data, klen, u64_to_user_ptr(cmd->data_uptr),
cmd->data_len)) {
rc = -EFAULT;
goto out_free_data;
}
rc = ops->set_dev_data_user(idev->dev, data);
out_free_data:
kfree(data);
out_put_idev:
iommufd_put_object(&idev->obj);
return rc;
}
On Fri, Apr 21, 2023 at 10:37:22AM -0700, Nicolin Chen wrote:
> How about the following piece? Needs a test with QEMU though..
>
> static const size_t iommufd_device_data_size[] = {
> [IOMMU_HW_INFO_TYPE_NONE] = 0,
> [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0,
> [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] =
> sizeof(struct iommu_device_data_arm_smmuv3),
> };
If we need more than one of these things we'll need a better
solution..
> rc = ops->set_dev_data_user(idev->dev, data);
Where will the iommu driver store the vsid to sid xarray from these
arguments?
Jason
On Fri, Apr 21, 2023 at 02:59:37PM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 21, 2023 at 10:37:22AM -0700, Nicolin Chen wrote:
>
> > How about the following piece? Needs a test with QEMU though..
> >
> > static const size_t iommufd_device_data_size[] = {
> > [IOMMU_HW_INFO_TYPE_NONE] = 0,
> > [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0,
> > [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] =
> > sizeof(struct iommu_device_data_arm_smmuv3),
> > };
>
> If we need more than one of these things we'll need a better
> solution..
How about adding ops->device_data_size to store the value?
And, since we have a few size arrays in hw_pagetable.c too,
perhaps a new structure in ops packing all these sizes can
clean up a bit things too? For example,
static struct iommu_user_data_size arm_smmu_user_data_size = {
.device_data_size = sizeof(iommu_device_data_arm_smmuv3),
.hwpt_alloc_data_size = sizeof(iommu_hwpt_alloc_arm_smmuv3),
.hwpt_invalidate_data_size = sizeof(iommu_hwpt_invalidate_arm_smmuv3),
}
The hwpt_xxx_data_size might be in form of arrays for multi-
HWPT_TYPE support.
> > rc = ops->set_dev_data_user(idev->dev, data);
>
> Where will the iommu driver store the vsid to sid xarray from these
> arguments?
The ARM structure packs a vsid. For example:
static int arm_smmu_set_data(struct device *dev, const void *user_data)
{
const struct iommufd_device_data_arm_smmuv3 *data = user_data;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_stream *stream = &master->streams[0];
struct arm_smmu_device *smmu = master->smmu;
u32 sid_user = data->sid;
int ret = 0;
if (!sid_user)
return -EINVAL;
ret = xa_alloc(&smmu->streams_user, &sid_user, stream,
XA_LIMIT(sid_user, sid_user), GFP_KERNEL_ACCOUNT);
if (ret)
return ret;
stream->id_user = sid_user;
return 0;
}
Thanks
Nic
On Fri, Apr 21, 2023 at 11:19:23AM -0700, Nicolin Chen wrote:
> On Fri, Apr 21, 2023 at 02:59:37PM -0300, Jason Gunthorpe wrote:
> > On Fri, Apr 21, 2023 at 10:37:22AM -0700, Nicolin Chen wrote:
> >
> > > How about the following piece? Needs a test with QEMU though..
> > >
> > > static const size_t iommufd_device_data_size[] = {
> > > [IOMMU_HW_INFO_TYPE_NONE] = 0,
> > > [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0,
> > > [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] =
> > > sizeof(struct iommu_device_data_arm_smmuv3),
> > > };
> >
> > If we need more than one of these things we'll need a better
> > solution..
>
> How about adding ops->device_data_size to store the value?
https://lore.kernel.org/linux-iommu/cover.1682234302.git.nicolinc@nvidia.com/
I sent a v3 that includes this replacing the data_size array.
If it looks good, we can drop the other two data_size arrays
for hwpt in the nesting series too.
Thanks
Nic
© 2016 - 2025 Red Hat, Inc.