[PATCH v3 9/9] vhost: add WARNING if log_num is more than limit

Dongli Zhang posted 9 patches 8 months, 2 weeks ago
[PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
Posted by Dongli Zhang 8 months, 2 weeks ago
Since long time ago, the only user of vq->log is vhost-net. The concern is
to add support for more devices (i.e. vhost-scsi or vsock) may reveals
unknown issue in the vhost API. Add a WARNING.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/vhost/vhost.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 494b3da5423a..b7d51d569646 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
 		if (access == VHOST_ACCESS_WO) {
 			*in_num += ret;
 			if (unlikely(log && ret)) {
+				/*
+				 * Since long time ago, the only user of
+				 * vq->log is vhost-net. The concern is to
+				 * add support for more devices (i.e.
+				 * vhost-scsi or vsock) may reveals unknown
+				 * issue in the vhost API. Add a WARNING.
+				 */
+				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
+
 				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
 				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
 				++*log_num;
@@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 			 * increment that count. */
 			*in_num += ret;
 			if (unlikely(log && ret)) {
+				/*
+				 * Since long time ago, the only user of
+				 * vq->log is vhost-net. The concern is to
+				 * add support for more devices (i.e.
+				 * vhost-scsi or vsock) may reveals unknown
+				 * issue in the vhost API. Add a WARNING.
+				 */
+				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
+
 				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
 				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
 				++*log_num;
-- 
2.39.3
Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
Posted by Michael S. Tsirkin 8 months ago
On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
> Since long time ago, the only user of vq->log is vhost-net. The concern is
> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
> unknown issue in the vhost API. Add a WARNING.
> 
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>


Userspace can trigger this I think, this is a problem since
people run with reboot on warn.
Pls grammar issues in comments... I don't think so.

> ---
>  drivers/vhost/vhost.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 494b3da5423a..b7d51d569646 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
>  		if (access == VHOST_ACCESS_WO) {
>  			*in_num += ret;
>  			if (unlikely(log && ret)) {
> +				/*
> +				 * Since long time ago, the only user of
> +				 * vq->log is vhost-net. The concern is to
> +				 * add support for more devices (i.e.
> +				 * vhost-scsi or vsock) may reveals unknown
> +				 * issue in the vhost API. Add a WARNING.
> +				 */
> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> +
>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>  				++*log_num;
> @@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>  			 * increment that count. */
>  			*in_num += ret;
>  			if (unlikely(log && ret)) {
> +				/*
> +				 * Since long time ago, the only user of
> +				 * vq->log is vhost-net. The concern is to
> +				 * add support for more devices (i.e.
> +				 * vhost-scsi or vsock) may reveals unknown
> +				 * issue in the vhost API. Add a WARNING.
> +				 */
> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> +
>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>  				++*log_num;
> -- 
> 2.39.3
Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
Posted by Dongli Zhang 8 months ago
Hi Michael,

On 4/14/25 9:32 AM, Michael S. Tsirkin wrote:
> On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
>> Since long time ago, the only user of vq->log is vhost-net. The concern is
>> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
>> unknown issue in the vhost API. Add a WARNING.
>>
>> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> 
> 
> Userspace can trigger this I think, this is a problem since
> people run with reboot on warn.

I think it will be a severe kernel bug (page fault) if userspace can trigger this.

If (*log_num >= vq->dev->iov_limit), the next line will lead to an out-of-bound
memory access:

    log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);

I could not propose a case to trigger the WARNING from userspace. Would you mind
helping explain if that can happen?

> Pls grammar issues in comments... I don't think so.

I did an analysis of code and so far I could not identify any case to trigger
(*log_num >= vq->dev->iov_limit).

The objective of the patch is to add a WARNING to double confirm the case won't
happen.

Regarding "I don't think so", would you mean we don't need this patch/WARNING
because the code is robust enough?

Thank you very much!

Dongli Zhang

> 
>> ---
>>  drivers/vhost/vhost.c | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 494b3da5423a..b7d51d569646 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
>>  		if (access == VHOST_ACCESS_WO) {
>>  			*in_num += ret;
>>  			if (unlikely(log && ret)) {
>> +				/*
>> +				 * Since long time ago, the only user of
>> +				 * vq->log is vhost-net. The concern is to
>> +				 * add support for more devices (i.e.
>> +				 * vhost-scsi or vsock) may reveals unknown
>> +				 * issue in the vhost API. Add a WARNING.
>> +				 */
>> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
>> +
>>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>>  				++*log_num;
>> @@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>>  			 * increment that count. */
>>  			*in_num += ret;
>>  			if (unlikely(log && ret)) {
>> +				/*
>> +				 * Since long time ago, the only user of
>> +				 * vq->log is vhost-net. The concern is to
>> +				 * add support for more devices (i.e.
>> +				 * vhost-scsi or vsock) may reveals unknown
>> +				 * issue in the vhost API. Add a WARNING.
>> +				 */
>> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
>> +
>>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>>  				++*log_num;
>> -- 
>> 2.39.3
>
Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
Posted by Michael S. Tsirkin 8 months ago
On Mon, Apr 14, 2025 at 09:52:04AM -0700, Dongli Zhang wrote:
> Hi Michael,
> 
> On 4/14/25 9:32 AM, Michael S. Tsirkin wrote:
> > On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
> >> Since long time ago, the only user of vq->log is vhost-net. The concern is
> >> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
> >> unknown issue in the vhost API. Add a WARNING.
> >>
> >> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> >> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> > 
> > 
> > Userspace can trigger this I think, this is a problem since
> > people run with reboot on warn.
> 
> I think it will be a severe kernel bug (page fault) if userspace can trigger this.
> 
> If (*log_num >= vq->dev->iov_limit), the next line will lead to an out-of-bound
> memory access:
> 
>     log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> 
> I could not propose a case to trigger the WARNING from userspace. Would you mind
> helping explain if that can happen?

Oh I see. the commit log made me think this is an actual issue,
not a debugging aid just in case.


> > Pls grammar issues in comments... I don't think so.
> 
> I did an analysis of code and so far I could not identify any case to trigger
> (*log_num >= vq->dev->iov_limit).
> 
> The objective of the patch is to add a WARNING to double confirm the case won't
> happen.
> 
> Regarding "I don't think so", would you mean we don't need this patch/WARNING
> because the code is robust enough?
> 
> Thank you very much!
> 
> Dongli Zhang


Let me clarify the comment is misleading.
All it has to say is:

	/* Let's make sure we are not out of bounds. */
	BUG_ON(*log_num >= vq->dev->iov_limit);

at the same time, this is unnecessary pointer chasing
on critical path, and I don't much like it that we are
making an assumption about array size here.

If you strongly want to do it, you must document it near
get_indirect: 
@log - array of size at least vq->dev->iov_limit


> > 
> >> ---
> >>  drivers/vhost/vhost.c | 18 ++++++++++++++++++
> >>  1 file changed, 18 insertions(+)
> >>
> >> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> >> index 494b3da5423a..b7d51d569646 100644
> >> --- a/drivers/vhost/vhost.c
> >> +++ b/drivers/vhost/vhost.c
> >> @@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
> >>  		if (access == VHOST_ACCESS_WO) {
> >>  			*in_num += ret;
> >>  			if (unlikely(log && ret)) {
> >> +				/*
> >> +				 * Since long time ago, the only user of
> >> +				 * vq->log is vhost-net. The concern is to
> >> +				 * add support for more devices (i.e.
> >> +				 * vhost-scsi or vsock) may reveals unknown
> >> +				 * issue in the vhost API. Add a WARNING.
> >> +				 */
> >> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> >> +
> >>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> >>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
> >>  				++*log_num;
> >> @@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> >>  			 * increment that count. */
> >>  			*in_num += ret;
> >>  			if (unlikely(log && ret)) {
> >> +				/*
> >> +				 * Since long time ago, the only user of
> >> +				 * vq->log is vhost-net. The concern is to
> >> +				 * add support for more devices (i.e.
> >> +				 * vhost-scsi or vsock) may reveals unknown
> >> +				 * issue in the vhost API. Add a WARNING.
> >> +				 */
> >> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> >> +
> >>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> >>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
> >>  				++*log_num;
> >> -- 
> >> 2.39.3
> >
Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
Posted by Dongli Zhang 8 months ago
Hi Michael,

On 4/14/25 11:39 AM, Michael S. Tsirkin wrote:
> On Mon, Apr 14, 2025 at 09:52:04AM -0700, Dongli Zhang wrote:
>> Hi Michael,
>>
>> On 4/14/25 9:32 AM, Michael S. Tsirkin wrote:
>>> On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
>>>> Since long time ago, the only user of vq->log is vhost-net. The concern is
>>>> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
>>>> unknown issue in the vhost API. Add a WARNING.
>>>>
>>>> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>>
>>>
>>> Userspace can trigger this I think, this is a problem since
>>> people run with reboot on warn.
>>
>> I think it will be a severe kernel bug (page fault) if userspace can trigger this.
>>
>> If (*log_num >= vq->dev->iov_limit), the next line will lead to an out-of-bound
>> memory access:
>>
>>     log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>>
>> I could not propose a case to trigger the WARNING from userspace. Would you mind
>> helping explain if that can happen?
> 
> Oh I see. the commit log made me think this is an actual issue,
> not a debugging aid just in case.
> 
> 
>>> Pls grammar issues in comments... I don't think so.
>>
>> I did an analysis of code and so far I could not identify any case to trigger
>> (*log_num >= vq->dev->iov_limit).
>>
>> The objective of the patch is to add a WARNING to double confirm the case won't
>> happen.
>>
>> Regarding "I don't think so", would you mean we don't need this patch/WARNING
>> because the code is robust enough?
>>
>> Thank you very much!
>>
>> Dongli Zhang
> 
> 
> Let me clarify the comment is misleading.
> All it has to say is:
> 
> 	/* Let's make sure we are not out of bounds. */
> 	BUG_ON(*log_num >= vq->dev->iov_limit);

This is a critical path only during Live Migration is in progress. So far I
didn't encounter any issue for either vhost-net or vhost-scsi. That's why I used
WARN_ON_ONCE() instead of a BUG_ON().

I agree with your point on "unnecessary pointer chasing on critical path", I am
going to remove this patch in the next version.

Thank you very much for feedback and suggestion!

Dongli Zhang

> 
> at the same time, this is unnecessary pointer chasing
> on critical path, and I don't much like it that we are
> making an assumption about array size here.
> 
> If you strongly want to do it, you must document it near
> get_indirect: 
> @log - array of size at least vq->dev->iov_limit
>