When device ownership is passed to a new process via VHOST_NEW_OWNER,
some devices need to know the new userland addresses of the dma mappings.
Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr
of a mapping. The new uaddr must address the same memory object as
originally mapped.
The user must suspend the device before the old address is invalidated,
and cannot resume it until after VHOST_IOTLB_REMAP is called, but this
requirement is not enforced by the API.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++
include/uapi/linux/vhost_types.h | 11 +++++-
2 files changed, 68 insertions(+), 1 deletion(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 4396fe1a90c4..51f71c45c4a9 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
}
+static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ struct vhost_iotlb_msg *msg)
+{
+ struct vdpa_device *vdpa = v->vdpa;
+ const struct vdpa_config_ops *ops = vdpa->config;
+ u32 asid = iotlb_to_asid(iotlb);
+ u64 start = msg->iova;
+ u64 last = start + msg->size - 1;
+ struct vhost_iotlb_map *map;
+ int r = 0;
+
+ if (msg->perm || !msg->size)
+ return -EINVAL;
+
+ map = vhost_iotlb_itree_first(iotlb, start, last);
+ if (!map)
+ return -ENOENT;
+
+ if (map->start != start || map->last != last)
+ return -EINVAL;
+
+ /*
+ * The current implementation does not support the platform iommu
+ * with use_va. And if !use_va, remap is not necessary.
+ */
+ if (!ops->set_map && !ops->dma_map)
+ return -EINVAL;
+
+ /*
+ * The current implementation supports set_map but not dma_map.
+ */
+ if (!ops->set_map)
+ return -EINVAL;
+
+ /*
+ * Do not verify that the new size@uaddr points to the same physical
+ * pages as the old size@uaddr, because that would take time O(npages),
+ * and would increase guest down time during live update. If the app
+ * is buggy and they differ, then the app may corrupt its own memory,
+ * but no one else's.
+ */
+
+ /*
+ * Batch will finish later in BATCH_END by calling set_map for the new
+ * addresses collected here. Non-batch must do it now.
+ */
+ if (!v->in_batch)
+ r = ops->set_map(vdpa, asid, iotlb);
+ if (!r)
+ map->addr = msg->uaddr;
+
+ return r;
+}
+
static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
struct vhost_iotlb *iotlb,
struct vhost_iotlb_msg *msg)
@@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
ops->set_map(vdpa, asid, iotlb);
v->in_batch = false;
break;
+ case VHOST_IOTLB_REMAP:
+ r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg);
+ break;
default:
r = -EINVAL;
break;
diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index 9177843951e9..35908315ff55 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -79,7 +79,7 @@ struct vhost_iotlb_msg {
/*
* VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
* multiple mappings in one go: beginning with
- * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
+ * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or
* VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
* When one of these two values is used as the message type, the rest
* of the fields in the message are ignored. There's no guarantee that
@@ -87,6 +87,15 @@ struct vhost_iotlb_msg {
*/
#define VHOST_IOTLB_BATCH_BEGIN 5
#define VHOST_IOTLB_BATCH_END 6
+
+/*
+ * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova.
+ * The new uaddr must address the same memory object as originally mapped.
+ * Failure to do so will result in user memory corruption and/or device
+ * misbehavior. iova and size must match the arguments used to create the
+ * an existing mapping. Protection is not changed, and perm must be 0.
+ */
+#define VHOST_IOTLB_REMAP 7
__u8 type;
};
--
2.39.3
On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>
> When device ownership is passed to a new process via VHOST_NEW_OWNER,
> some devices need to know the new userland addresses of the dma mappings.
> Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr
> of a mapping. The new uaddr must address the same memory object as
> originally mapped.
>
> The user must suspend the device before the old address is invalidated,
> and cannot resume it until after VHOST_IOTLB_REMAP is called, but this
> requirement is not enforced by the API.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++
> include/uapi/linux/vhost_types.h | 11 +++++-
> 2 files changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 4396fe1a90c4..51f71c45c4a9 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
>
> }
>
> +static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v,
> + struct vhost_iotlb *iotlb,
> + struct vhost_iotlb_msg *msg)
> +{
> + struct vdpa_device *vdpa = v->vdpa;
> + const struct vdpa_config_ops *ops = vdpa->config;
> + u32 asid = iotlb_to_asid(iotlb);
> + u64 start = msg->iova;
> + u64 last = start + msg->size - 1;
> + struct vhost_iotlb_map *map;
> + int r = 0;
> +
> + if (msg->perm || !msg->size)
> + return -EINVAL;
> +
> + map = vhost_iotlb_itree_first(iotlb, start, last);
> + if (!map)
> + return -ENOENT;
> +
> + if (map->start != start || map->last != last)
> + return -EINVAL;
I had a question here, if a buggy user space that:
1) forget to update some of the mappings
2) address is wrong
3) other cases.
Does this mean the device can DMA to the previous owner? If yes, does
this have security implications?
> +
> + /*
> + * The current implementation does not support the platform iommu
> + * with use_va. And if !use_va, remap is not necessary.
> + */
> + if (!ops->set_map && !ops->dma_map)
> + return -EINVAL;
> +
> + /*
> + * The current implementation supports set_map but not dma_map.
> + */
> + if (!ops->set_map)
> + return -EINVAL;
> +
> + /*
> + * Do not verify that the new size@uaddr points to the same physical
> + * pages as the old size@uaddr, because that would take time O(npages),
> + * and would increase guest down time during live update. If the app
> + * is buggy and they differ, then the app may corrupt its own memory,
> + * but no one else's.
> + */
> +
> + /*
> + * Batch will finish later in BATCH_END by calling set_map for the new
> + * addresses collected here. Non-batch must do it now.
> + */
> + if (!v->in_batch)
> + r = ops->set_map(vdpa, asid, iotlb);
> + if (!r)
> + map->addr = msg->uaddr;
> +
> + return r;
> +}
> +
> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> struct vhost_iotlb *iotlb,
> struct vhost_iotlb_msg *msg)
> @@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
> ops->set_map(vdpa, asid, iotlb);
> v->in_batch = false;
> break;
> + case VHOST_IOTLB_REMAP:
> + r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg);
> + break;
> default:
> r = -EINVAL;
> break;
> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
> index 9177843951e9..35908315ff55 100644
> --- a/include/uapi/linux/vhost_types.h
> +++ b/include/uapi/linux/vhost_types.h
> @@ -79,7 +79,7 @@ struct vhost_iotlb_msg {
> /*
> * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
> * multiple mappings in one go: beginning with
> - * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
> + * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or
> * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
> * When one of these two values is used as the message type, the rest
> * of the fields in the message are ignored. There's no guarantee that
> @@ -87,6 +87,15 @@ struct vhost_iotlb_msg {
> */
> #define VHOST_IOTLB_BATCH_BEGIN 5
> #define VHOST_IOTLB_BATCH_END 6
> +
> +/*
> + * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova.
> + * The new uaddr must address the same memory object as originally mapped.
> + * Failure to do so will result in user memory corruption and/or device
> + * misbehavior. iova and size must match the arguments used to create the
> + * an existing mapping. Protection is not changed, and perm must be 0.
> + */
> +#define VHOST_IOTLB_REMAP 7
> __u8 type;
> };
Thanks
>
> --
> 2.39.3
>
On 7/14/2024 10:34 PM, Jason Wang wrote:
> On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>
>> When device ownership is passed to a new process via VHOST_NEW_OWNER,
>> some devices need to know the new userland addresses of the dma mappings.
>> Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr
>> of a mapping. The new uaddr must address the same memory object as
>> originally mapped.
>>
>> The user must suspend the device before the old address is invalidated,
>> and cannot resume it until after VHOST_IOTLB_REMAP is called, but this
>> requirement is not enforced by the API.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++
>> include/uapi/linux/vhost_types.h | 11 +++++-
>> 2 files changed, 68 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 4396fe1a90c4..51f71c45c4a9 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
>>
>> }
>>
>> +static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v,
>> + struct vhost_iotlb *iotlb,
>> + struct vhost_iotlb_msg *msg)
>> +{
>> + struct vdpa_device *vdpa = v->vdpa;
>> + const struct vdpa_config_ops *ops = vdpa->config;
>> + u32 asid = iotlb_to_asid(iotlb);
>> + u64 start = msg->iova;
>> + u64 last = start + msg->size - 1;
>> + struct vhost_iotlb_map *map;
>> + int r = 0;
>> +
>> + if (msg->perm || !msg->size)
>> + return -EINVAL;
>> +
>> + map = vhost_iotlb_itree_first(iotlb, start, last);
>> + if (!map)
>> + return -ENOENT;
>> +
>> + if (map->start != start || map->last != last)
>> + return -EINVAL;
>
> I had a question here, if a buggy user space that:
>
> 1) forget to update some of the mappings
> 2) address is wrong
> 3) other cases.
>
> Does this mean the device can DMA to the previous owner?
Yes, but only to the mappings which were already pinned for DMA for this
device, and the old owner is giving the new owner permission to DMA to that
memory even without bugs.
> If yes, does
> this have security implications?
No. The previous owner has given the new owner permission to take over. They
trust each other completely. In the live update case, they are the same; the new
owner is the old owner reincarnated.
- Steve
>> +
>> + /*
>> + * The current implementation does not support the platform iommu
>> + * with use_va. And if !use_va, remap is not necessary.
>> + */
>> + if (!ops->set_map && !ops->dma_map)
>> + return -EINVAL;
>> +
>> + /*
>> + * The current implementation supports set_map but not dma_map.
>> + */
>> + if (!ops->set_map)
>> + return -EINVAL;
>> +
>> + /*
>> + * Do not verify that the new size@uaddr points to the same physical
>> + * pages as the old size@uaddr, because that would take time O(npages),
>> + * and would increase guest down time during live update. If the app
>> + * is buggy and they differ, then the app may corrupt its own memory,
>> + * but no one else's.
>> + */
>> +
>> + /*
>> + * Batch will finish later in BATCH_END by calling set_map for the new
>> + * addresses collected here. Non-batch must do it now.
>> + */
>> + if (!v->in_batch)
>> + r = ops->set_map(vdpa, asid, iotlb);
>> + if (!r)
>> + map->addr = msg->uaddr;
>> +
>> + return r;
>> +}
>> +
>> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>> struct vhost_iotlb *iotlb,
>> struct vhost_iotlb_msg *msg)
>> @@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
>> ops->set_map(vdpa, asid, iotlb);
>> v->in_batch = false;
>> break;
>> + case VHOST_IOTLB_REMAP:
>> + r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg);
>> + break;
>> default:
>> r = -EINVAL;
>> break;
>> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
>> index 9177843951e9..35908315ff55 100644
>> --- a/include/uapi/linux/vhost_types.h
>> +++ b/include/uapi/linux/vhost_types.h
>> @@ -79,7 +79,7 @@ struct vhost_iotlb_msg {
>> /*
>> * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
>> * multiple mappings in one go: beginning with
>> - * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
>> + * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or
>> * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
>> * When one of these two values is used as the message type, the rest
>> * of the fields in the message are ignored. There's no guarantee that
>> @@ -87,6 +87,15 @@ struct vhost_iotlb_msg {
>> */
>> #define VHOST_IOTLB_BATCH_BEGIN 5
>> #define VHOST_IOTLB_BATCH_END 6
>> +
>> +/*
>> + * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova.
>> + * The new uaddr must address the same memory object as originally mapped.
>> + * Failure to do so will result in user memory corruption and/or device
>> + * misbehavior. iova and size must match the arguments used to create the
>> + * an existing mapping. Protection is not changed, and perm must be 0.
>> + */
>> +#define VHOST_IOTLB_REMAP 7
>> __u8 type;
>> };
>
> Thanks
>
>>
>> --
>> 2.39.3
>>
>
On Mon, Jul 15, 2024 at 10:28 PM Steven Sistare
<steven.sistare@oracle.com> wrote:
>
> On 7/14/2024 10:34 PM, Jason Wang wrote:
> > On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@oracle.com> wrote:
> >>
> >> When device ownership is passed to a new process via VHOST_NEW_OWNER,
> >> some devices need to know the new userland addresses of the dma mappings.
> >> Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr
> >> of a mapping. The new uaddr must address the same memory object as
> >> originally mapped.
> >>
> >> The user must suspend the device before the old address is invalidated,
> >> and cannot resume it until after VHOST_IOTLB_REMAP is called, but this
> >> requirement is not enforced by the API.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >> drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++
> >> include/uapi/linux/vhost_types.h | 11 +++++-
> >> 2 files changed, 68 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >> index 4396fe1a90c4..51f71c45c4a9 100644
> >> --- a/drivers/vhost/vdpa.c
> >> +++ b/drivers/vhost/vdpa.c
> >> @@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
> >>
> >> }
> >>
> >> +static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v,
> >> + struct vhost_iotlb *iotlb,
> >> + struct vhost_iotlb_msg *msg)
> >> +{
> >> + struct vdpa_device *vdpa = v->vdpa;
> >> + const struct vdpa_config_ops *ops = vdpa->config;
> >> + u32 asid = iotlb_to_asid(iotlb);
> >> + u64 start = msg->iova;
> >> + u64 last = start + msg->size - 1;
> >> + struct vhost_iotlb_map *map;
> >> + int r = 0;
> >> +
> >> + if (msg->perm || !msg->size)
> >> + return -EINVAL;
> >> +
> >> + map = vhost_iotlb_itree_first(iotlb, start, last);
> >> + if (!map)
> >> + return -ENOENT;
> >> +
> >> + if (map->start != start || map->last != last)
> >> + return -EINVAL;
> >
> > I had a question here, if a buggy user space that:
> >
> > 1) forget to update some of the mappings
> > 2) address is wrong
> > 3) other cases.
> >
> > Does this mean the device can DMA to the previous owner?
>
> Yes, but only to the mappings which were already pinned for DMA for this
> device, and the old owner is giving the new owner permission to DMA to that
> memory even without bugs.
>
> > If yes, does
> > this have security implications?
>
> No. The previous owner has given the new owner permission to take over. They
> trust each other completely. In the live update case, they are the same; the new
> owner is the old owner reincarnated.
I understand the processes may trust each other but I meant the kernel
may not trust those processes.
For example:
1) old owner pass fd to new owner which is another process
2) the new owner do VHOST_NEW_OWNER
3) new owner doesn't do remap correctly
There's no way for the old owner to remove/unpin the mappings as we
have the owner check in IOTLB_UPDATE. Looks like a potential way for
DOS.
Thanks
>
> - Steve
>
> >> +
> >> + /*
> >> + * The current implementation does not support the platform iommu
> >> + * with use_va. And if !use_va, remap is not necessary.
> >> + */
> >> + if (!ops->set_map && !ops->dma_map)
> >> + return -EINVAL;
> >> +
> >> + /*
> >> + * The current implementation supports set_map but not dma_map.
> >> + */
> >> + if (!ops->set_map)
> >> + return -EINVAL;
> >> +
> >> + /*
> >> + * Do not verify that the new size@uaddr points to the same physical
> >> + * pages as the old size@uaddr, because that would take time O(npages),
> >> + * and would increase guest down time during live update. If the app
> >> + * is buggy and they differ, then the app may corrupt its own memory,
> >> + * but no one else's.
> >> + */
> >> +
> >> + /*
> >> + * Batch will finish later in BATCH_END by calling set_map for the new
> >> + * addresses collected here. Non-batch must do it now.
> >> + */
> >> + if (!v->in_batch)
> >> + r = ops->set_map(vdpa, asid, iotlb);
> >> + if (!r)
> >> + map->addr = msg->uaddr;
> >> +
> >> + return r;
> >> +}
> >> +
> >> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> >> struct vhost_iotlb *iotlb,
> >> struct vhost_iotlb_msg *msg)
> >> @@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
> >> ops->set_map(vdpa, asid, iotlb);
> >> v->in_batch = false;
> >> break;
> >> + case VHOST_IOTLB_REMAP:
> >> + r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg);
> >> + break;
> >> default:
> >> r = -EINVAL;
> >> break;
> >> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
> >> index 9177843951e9..35908315ff55 100644
> >> --- a/include/uapi/linux/vhost_types.h
> >> +++ b/include/uapi/linux/vhost_types.h
> >> @@ -79,7 +79,7 @@ struct vhost_iotlb_msg {
> >> /*
> >> * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
> >> * multiple mappings in one go: beginning with
> >> - * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
> >> + * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or
> >> * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
> >> * When one of these two values is used as the message type, the rest
> >> * of the fields in the message are ignored. There's no guarantee that
> >> @@ -87,6 +87,15 @@ struct vhost_iotlb_msg {
> >> */
> >> #define VHOST_IOTLB_BATCH_BEGIN 5
> >> #define VHOST_IOTLB_BATCH_END 6
> >> +
> >> +/*
> >> + * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova.
> >> + * The new uaddr must address the same memory object as originally mapped.
> >> + * Failure to do so will result in user memory corruption and/or device
> >> + * misbehavior. iova and size must match the arguments used to create the
> >> + * an existing mapping. Protection is not changed, and perm must be 0.
> >> + */
> >> +#define VHOST_IOTLB_REMAP 7
> >> __u8 type;
> >> };
> >
> > Thanks
> >
> >>
> >> --
> >> 2.39.3
> >>
> >
>
On 7/16/2024 1:28 AM, Jason Wang wrote:
> On Mon, Jul 15, 2024 at 10:28 PM Steven Sistare
> <steven.sistare@oracle.com> wrote:
>>
>> On 7/14/2024 10:34 PM, Jason Wang wrote:
>>> On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>>>
>>>> When device ownership is passed to a new process via VHOST_NEW_OWNER,
>>>> some devices need to know the new userland addresses of the dma mappings.
>>>> Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr
>>>> of a mapping. The new uaddr must address the same memory object as
>>>> originally mapped.
>>>>
>>>> The user must suspend the device before the old address is invalidated,
>>>> and cannot resume it until after VHOST_IOTLB_REMAP is called, but this
>>>> requirement is not enforced by the API.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>> drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++
>>>> include/uapi/linux/vhost_types.h | 11 +++++-
>>>> 2 files changed, 68 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>>>> index 4396fe1a90c4..51f71c45c4a9 100644
>>>> --- a/drivers/vhost/vdpa.c
>>>> +++ b/drivers/vhost/vdpa.c
>>>> @@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
>>>>
>>>> }
>>>>
>>>> +static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v,
>>>> + struct vhost_iotlb *iotlb,
>>>> + struct vhost_iotlb_msg *msg)
>>>> +{
>>>> + struct vdpa_device *vdpa = v->vdpa;
>>>> + const struct vdpa_config_ops *ops = vdpa->config;
>>>> + u32 asid = iotlb_to_asid(iotlb);
>>>> + u64 start = msg->iova;
>>>> + u64 last = start + msg->size - 1;
>>>> + struct vhost_iotlb_map *map;
>>>> + int r = 0;
>>>> +
>>>> + if (msg->perm || !msg->size)
>>>> + return -EINVAL;
>>>> +
>>>> + map = vhost_iotlb_itree_first(iotlb, start, last);
>>>> + if (!map)
>>>> + return -ENOENT;
>>>> +
>>>> + if (map->start != start || map->last != last)
>>>> + return -EINVAL;
>>>
>>> I had a question here, if a buggy user space that:
>>>
>>> 1) forget to update some of the mappings
>>> 2) address is wrong
>>> 3) other cases.
>>>
>>> Does this mean the device can DMA to the previous owner?
>>
>> Yes, but only to the mappings which were already pinned for DMA for this
>> device, and the old owner is giving the new owner permission to DMA to that
>> memory even without bugs.
>>
>>> If yes, does
>>> this have security implications?
>>
>> No. The previous owner has given the new owner permission to take over. They
>> trust each other completely. In the live update case, they are the same; the new
>> owner is the old owner reincarnated.
>
> I understand the processes may trust each other but I meant the kernel
> may not trust those processes.
If a process holds the key (the fd) then the kernel can trust that is has
permission to use it. Keys are only passed around voluntarily, unless there
is a bug.
> For example:
>
> 1) old owner pass fd to new owner which is another process
> 2) the new owner do VHOST_NEW_OWNER
> 3) new owner doesn't do remap correctly
>
> There's no way for the old owner to remove/unpin the mappings as we
> have the owner check in IOTLB_UPDATE. Looks like a potential way for
> DOS.
This is a bug in the second cooperating process, not a DOS. The application
must fix it. Sometimes you cannot recover from an application bug at run time.
BTW, at one time vfio enforced the concept of an owner, but Alex deleted it.
It adds no value, because possession of the fd is the key.
ffed0518d871 ("vfio: remove useless judgement")
- Steve
>>>> +
>>>> + /*
>>>> + * The current implementation does not support the platform iommu
>>>> + * with use_va. And if !use_va, remap is not necessary.
>>>> + */
>>>> + if (!ops->set_map && !ops->dma_map)
>>>> + return -EINVAL;
>>>> +
>>>> + /*
>>>> + * The current implementation supports set_map but not dma_map.
>>>> + */
>>>> + if (!ops->set_map)
>>>> + return -EINVAL;
>>>> +
>>>> + /*
>>>> + * Do not verify that the new size@uaddr points to the same physical
>>>> + * pages as the old size@uaddr, because that would take time O(npages),
>>>> + * and would increase guest down time during live update. If the app
>>>> + * is buggy and they differ, then the app may corrupt its own memory,
>>>> + * but no one else's.
>>>> + */
>>>> +
>>>> + /*
>>>> + * Batch will finish later in BATCH_END by calling set_map for the new
>>>> + * addresses collected here. Non-batch must do it now.
>>>> + */
>>>> + if (!v->in_batch)
>>>> + r = ops->set_map(vdpa, asid, iotlb);
>>>> + if (!r)
>>>> + map->addr = msg->uaddr;
>>>> +
>>>> + return r;
>>>> +}
>>>> +
>>>> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>>>> struct vhost_iotlb *iotlb,
>>>> struct vhost_iotlb_msg *msg)
>>>> @@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
>>>> ops->set_map(vdpa, asid, iotlb);
>>>> v->in_batch = false;
>>>> break;
>>>> + case VHOST_IOTLB_REMAP:
>>>> + r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg);
>>>> + break;
>>>> default:
>>>> r = -EINVAL;
>>>> break;
>>>> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
>>>> index 9177843951e9..35908315ff55 100644
>>>> --- a/include/uapi/linux/vhost_types.h
>>>> +++ b/include/uapi/linux/vhost_types.h
>>>> @@ -79,7 +79,7 @@ struct vhost_iotlb_msg {
>>>> /*
>>>> * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
>>>> * multiple mappings in one go: beginning with
>>>> - * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
>>>> + * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or
>>>> * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
>>>> * When one of these two values is used as the message type, the rest
>>>> * of the fields in the message are ignored. There's no guarantee that
>>>> @@ -87,6 +87,15 @@ struct vhost_iotlb_msg {
>>>> */
>>>> #define VHOST_IOTLB_BATCH_BEGIN 5
>>>> #define VHOST_IOTLB_BATCH_END 6
>>>> +
>>>> +/*
>>>> + * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova.
>>>> + * The new uaddr must address the same memory object as originally mapped.
>>>> + * Failure to do so will result in user memory corruption and/or device
>>>> + * misbehavior. iova and size must match the arguments used to create the
>>>> + * an existing mapping. Protection is not changed, and perm must be 0.
>>>> + */
>>>> +#define VHOST_IOTLB_REMAP 7
>>>> __u8 type;
>>>> };
>>>
>>> Thanks
>>>
>>>>
>>>> --
>>>> 2.39.3
>>>>
>>>
>>
>
On Thu, Jul 18, 2024 at 2:29 AM Steven Sistare
<steven.sistare@oracle.com> wrote:
>
> On 7/16/2024 1:28 AM, Jason Wang wrote:
> > On Mon, Jul 15, 2024 at 10:28 PM Steven Sistare
> > <steven.sistare@oracle.com> wrote:
> >>
> >> On 7/14/2024 10:34 PM, Jason Wang wrote:
> >>> On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@oracle.com> wrote:
> >>>>
> >>>> When device ownership is passed to a new process via VHOST_NEW_OWNER,
> >>>> some devices need to know the new userland addresses of the dma mappings.
> >>>> Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr
> >>>> of a mapping. The new uaddr must address the same memory object as
> >>>> originally mapped.
> >>>>
> >>>> The user must suspend the device before the old address is invalidated,
> >>>> and cannot resume it until after VHOST_IOTLB_REMAP is called, but this
> >>>> requirement is not enforced by the API.
> >>>>
> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>>> ---
> >>>> drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++
> >>>> include/uapi/linux/vhost_types.h | 11 +++++-
> >>>> 2 files changed, 68 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >>>> index 4396fe1a90c4..51f71c45c4a9 100644
> >>>> --- a/drivers/vhost/vdpa.c
> >>>> +++ b/drivers/vhost/vdpa.c
> >>>> @@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
> >>>>
> >>>> }
> >>>>
> >>>> +static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v,
> >>>> + struct vhost_iotlb *iotlb,
> >>>> + struct vhost_iotlb_msg *msg)
> >>>> +{
> >>>> + struct vdpa_device *vdpa = v->vdpa;
> >>>> + const struct vdpa_config_ops *ops = vdpa->config;
> >>>> + u32 asid = iotlb_to_asid(iotlb);
> >>>> + u64 start = msg->iova;
> >>>> + u64 last = start + msg->size - 1;
> >>>> + struct vhost_iotlb_map *map;
> >>>> + int r = 0;
> >>>> +
> >>>> + if (msg->perm || !msg->size)
> >>>> + return -EINVAL;
> >>>> +
> >>>> + map = vhost_iotlb_itree_first(iotlb, start, last);
> >>>> + if (!map)
> >>>> + return -ENOENT;
> >>>> +
> >>>> + if (map->start != start || map->last != last)
> >>>> + return -EINVAL;
> >>>
> >>> I had a question here, if a buggy user space that:
> >>>
> >>> 1) forget to update some of the mappings
> >>> 2) address is wrong
> >>> 3) other cases.
> >>>
> >>> Does this mean the device can DMA to the previous owner?
> >>
> >> Yes, but only to the mappings which were already pinned for DMA for this
> >> device, and the old owner is giving the new owner permission to DMA to that
> >> memory even without bugs.
> >>
> >>> If yes, does
> >>> this have security implications?
> >>
> >> No. The previous owner has given the new owner permission to take over. They
> >> trust each other completely. In the live update case, they are the same; the new
> >> owner is the old owner reincarnated.
> >
> > I understand the processes may trust each other but I meant the kernel
> > may not trust those processes.
>
> If a process holds the key (the fd) then the kernel can trust that is has
> permission to use it. Keys are only passed around voluntarily, unless there
> is a bug.
Looks not, for example, kernel can choose to limit various operations
on a fd, even if the process holds the key
1) privileged process do setup on fd, passing that fd to unprivileged fd
2) unprivileged process can only use a subset of the functions of a fd
In the case of Qemu, it prevents the kernel from the case where for
example malicious guests can escape to Qemu.
In the case of vhost-net, the privilege is the owner. For example, the
following seems to be valid in the case of vhost-net:
1) Two processes (A and B) share a part of the memory
2) A is the owner of the vhost-net who is in charge of building memory
mappings via IOTLB
3) A passess vhost-net fd to process B
>
> > For example:
> >
> > 1) old owner pass fd to new owner which is another process
> > 2) the new owner do VHOST_NEW_OWNER
> > 3) new owner doesn't do remap correctly
> >
> > There's no way for the old owner to remove/unpin the mappings as we
> > have the owner check in IOTLB_UPDATE. Looks like a potential way for
> > DOS.
>
> This is a bug in the second cooperating process, not a DOS. The application
> must fix it. Sometimes you cannot recover from an application bug at run time.
>
> BTW, at one time vfio enforced the concept of an owner, but Alex deleted it.
> It adds no value, because possession of the fd is the key.
> ffed0518d871 ("vfio: remove useless judgement")
This seems to be a great relaxation of the ownership check. I would
like to hear from Michael first.
Thanks
>
> - Steve
>
> >>>> +
> >>>> + /*
> >>>> + * The current implementation does not support the platform iommu
> >>>> + * with use_va. And if !use_va, remap is not necessary.
> >>>> + */
> >>>> + if (!ops->set_map && !ops->dma_map)
> >>>> + return -EINVAL;
> >>>> +
> >>>> + /*
> >>>> + * The current implementation supports set_map but not dma_map.
> >>>> + */
> >>>> + if (!ops->set_map)
> >>>> + return -EINVAL;
> >>>> +
> >>>> + /*
> >>>> + * Do not verify that the new size@uaddr points to the same physical
> >>>> + * pages as the old size@uaddr, because that would take time O(npages),
> >>>> + * and would increase guest down time during live update. If the app
> >>>> + * is buggy and they differ, then the app may corrupt its own memory,
> >>>> + * but no one else's.
> >>>> + */
> >>>> +
> >>>> + /*
> >>>> + * Batch will finish later in BATCH_END by calling set_map for the new
> >>>> + * addresses collected here. Non-batch must do it now.
> >>>> + */
> >>>> + if (!v->in_batch)
> >>>> + r = ops->set_map(vdpa, asid, iotlb);
> >>>> + if (!r)
> >>>> + map->addr = msg->uaddr;
> >>>> +
> >>>> + return r;
> >>>> +}
> >>>> +
> >>>> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> >>>> struct vhost_iotlb *iotlb,
> >>>> struct vhost_iotlb_msg *msg)
> >>>> @@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
> >>>> ops->set_map(vdpa, asid, iotlb);
> >>>> v->in_batch = false;
> >>>> break;
> >>>> + case VHOST_IOTLB_REMAP:
> >>>> + r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg);
> >>>> + break;
> >>>> default:
> >>>> r = -EINVAL;
> >>>> break;
> >>>> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
> >>>> index 9177843951e9..35908315ff55 100644
> >>>> --- a/include/uapi/linux/vhost_types.h
> >>>> +++ b/include/uapi/linux/vhost_types.h
> >>>> @@ -79,7 +79,7 @@ struct vhost_iotlb_msg {
> >>>> /*
> >>>> * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
> >>>> * multiple mappings in one go: beginning with
> >>>> - * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
> >>>> + * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or
> >>>> * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
> >>>> * When one of these two values is used as the message type, the rest
> >>>> * of the fields in the message are ignored. There's no guarantee that
> >>>> @@ -87,6 +87,15 @@ struct vhost_iotlb_msg {
> >>>> */
> >>>> #define VHOST_IOTLB_BATCH_BEGIN 5
> >>>> #define VHOST_IOTLB_BATCH_END 6
> >>>> +
> >>>> +/*
> >>>> + * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova.
> >>>> + * The new uaddr must address the same memory object as originally mapped.
> >>>> + * Failure to do so will result in user memory corruption and/or device
> >>>> + * misbehavior. iova and size must match the arguments used to create the
> >>>> + * an existing mapping. Protection is not changed, and perm must be 0.
> >>>> + */
> >>>> +#define VHOST_IOTLB_REMAP 7
> >>>> __u8 type;
> >>>> };
> >>>
> >>> Thanks
> >>>
> >>>>
> >>>> --
> >>>> 2.39.3
> >>>>
> >>>
> >>
> >
>
On Thu, Jul 18, 2024 at 08:45:31AM +0800, Jason Wang wrote:
> > > For example:
> > >
> > > 1) old owner pass fd to new owner which is another process
> > > 2) the new owner do VHOST_NEW_OWNER
> > > 3) new owner doesn't do remap correctly
> > >
> > > There's no way for the old owner to remove/unpin the mappings as we
> > > have the owner check in IOTLB_UPDATE. Looks like a potential way for
> > > DOS.
> >
> > This is a bug in the second cooperating process, not a DOS. The application
> > must fix it. Sometimes you cannot recover from an application bug at run time.
> >
> > BTW, at one time vfio enforced the concept of an owner, but Alex deleted it.
> > It adds no value, because possession of the fd is the key.
> > ffed0518d871 ("vfio: remove useless judgement")
>
> This seems to be a great relaxation of the ownership check. I would
> like to hear from Michael first.
>
> Thanks
It could be that the ownership model is too restrictive.
But again, this is changing a security assumption.
Looks like yes another reason to tie this to the switch to iommufd.
--
MST
On 7/18/2024 3:39 PM, Michael S. Tsirkin wrote:
> On Thu, Jul 18, 2024 at 08:45:31AM +0800, Jason Wang wrote:
>>>> For example:
>>>>
>>>> 1) old owner pass fd to new owner which is another process
>>>> 2) the new owner do VHOST_NEW_OWNER
>>>> 3) new owner doesn't do remap correctly
>>>>
>>>> There's no way for the old owner to remove/unpin the mappings as we
>>>> have the owner check in IOTLB_UPDATE. Looks like a potential way for
>>>> DOS.
>>>
>>> This is a bug in the second cooperating process, not a DOS. The application
>>> must fix it. Sometimes you cannot recover from an application bug at run time.
>>>
>>> BTW, at one time vfio enforced the concept of an owner, but Alex deleted it.
>>> It adds no value, because possession of the fd is the key.
>>> ffed0518d871 ("vfio: remove useless judgement")
>>
>> This seems to be a great relaxation of the ownership check. I would
>> like to hear from Michael first.
>>
>> Thanks
>
> It could be that the ownership model is too restrictive.
> But again, this is changing a security assumption.
> Looks like yes another reason to tie this to the switch to iommufd.
iommufd, like vfio, does not impose an ownership requirement. If vdpa has a
stricter requirement, such as allowing the vhost-net sharing that Jason
described, then we need to surface that now, and extend it to allow change
of ownership for live update.
Is the vhost-net scenario currently used, or aspirational?
Copying from Jason's email:
1) Two processes (A and B) share a part of the memory
2) A is the owner of the vhost-net who is in charge of building memory
mappings via IOTLB
3) A passes vhost-net fd to process B
- Steve
On Fri, Jul 19, 2024 at 4:19 AM Steven Sistare
<steven.sistare@oracle.com> wrote:
>
> On 7/18/2024 3:39 PM, Michael S. Tsirkin wrote:
> > On Thu, Jul 18, 2024 at 08:45:31AM +0800, Jason Wang wrote:
> >>>> For example:
> >>>>
> >>>> 1) old owner pass fd to new owner which is another process
> >>>> 2) the new owner do VHOST_NEW_OWNER
> >>>> 3) new owner doesn't do remap correctly
> >>>>
> >>>> There's no way for the old owner to remove/unpin the mappings as we
> >>>> have the owner check in IOTLB_UPDATE. Looks like a potential way for
> >>>> DOS.
> >>>
> >>> This is a bug in the second cooperating process, not a DOS. The application
> >>> must fix it. Sometimes you cannot recover from an application bug at run time.
> >>>
> >>> BTW, at one time vfio enforced the concept of an owner, but Alex deleted it.
> >>> It adds no value, because possession of the fd is the key.
> >>> ffed0518d871 ("vfio: remove useless judgement")
> >>
> >> This seems to be a great relaxation of the ownership check. I would
> >> like to hear from Michael first.
> >>
> >> Thanks
> >
> > It could be that the ownership model is too restrictive.
> > But again, this is changing a security assumption.
> > Looks like yes another reason to tie this to the switch to iommufd.
>
> iommufd, like vfio, does not impose an ownership requirement. If vdpa has a
> stricter requirement, such as allowing the vhost-net sharing that Jason
> described, then we need to surface that now, and extend it to allow change
> of ownership for live update.
>
> Is the vhost-net scenario currently used, or aspirational?
This question is very hard. But for my understanding, what Michael meant is:
1) the current proposal changes the security assumption
2) iommufd require uAPI changes which may imply security assumption
changes or other user space noticeable behaviour changes
My understanding is:
If we want to go without iommufd, we may need to find a way to stick
to the assumption (not sure how hard it is). If we want to go with
iommufd, we probably won't worry about that, as you've pointed out,
there's no ownership requirement.
> Copying from Jason's email:
> 1) Two processes (A and B) share a part of the memory
> 2) A is the owner of the vhost-net who is in charge of building memory
> mappings via IOTLB
> 3) A passes vhost-net fd to process B
>
> - Steve
>
Thanks
© 2016 - 2025 Red Hat, Inc.