A 4K page appears to be able to hold 128 ioreq entries, which luckly
matches the current vCPU limit. However, if we decide to increase the
domain vCPU limit, that doesn't hold anymore and this function would now
silently create a out of bounds pointer leading to confusing problems.
All architectures have no more than 128 as vCPU limit on HVM guests,
and have pages that are at most 4 KB, so this case doesn't occurs in
with the current limits.
Assert if the vCPU ID will lead to a out of bounds pointer.
No functional change.
Reported-by: Julian Vetter <julian.vetter@vates.tech>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
Not sure if this is the best approach, perhaps preventing compilation if the
vCPU limit is higher than what the ioreq page can hold is preferable ?
xen/common/ioreq.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index f5fd30ce12..b2ef46ed7b 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -99,6 +99,7 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
ASSERT((v == current) || !vcpu_runnable(v));
ASSERT(p != NULL);
+ ASSERT(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq)));
return &p->vcpu_ioreq[v->vcpu_id];
}
--
2.51.2
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
On 14/11/2025 1:05 pm, Teddy Astie wrote: > A 4K page appears to be able to hold 128 ioreq entries, which luckly > matches the current vCPU limit. However, if we decide to increase the > domain vCPU limit, that doesn't hold anymore and this function would now > silently create a out of bounds pointer leading to confusing problems. > > All architectures have no more than 128 as vCPU limit on HVM guests, > and have pages that are at most 4 KB, so this case doesn't occurs in > with the current limits. > > Assert if the vCPU ID will lead to a out of bounds pointer. > > No functional change. > > Reported-by: Julian Vetter <julian.vetter@vates.tech> > Signed-off-by: Teddy Astie <teddy.astie@vates.tech> > --- > Not sure if this is the best approach, perhaps preventing compilation if the > vCPU limit is higher than what the ioreq page can hold is preferable ? > > xen/common/ioreq.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c > index f5fd30ce12..b2ef46ed7b 100644 > --- a/xen/common/ioreq.c > +++ b/xen/common/ioreq.c > @@ -99,6 +99,7 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v) > > ASSERT((v == current) || !vcpu_runnable(v)); > ASSERT(p != NULL); > + ASSERT(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))); > > return &p->vcpu_ioreq[v->vcpu_id]; > } The 128 vCPU problem was known and IOERQ servers intentionally had multi-page capabilities from the outset to address this problem. See the calculation in ioreq_server_max_frames(). It hasn't been exercised in anger because there's still the APIC ID limit at 128 still, but IOREQ servers are expected to work correctly above this limit. That said, this function is clearly buggy and in need of fixing. To the assert specifically, that's really not appropriate. If it were to ever fire, we'd have an XSA. Logic if this nature either needs to be fail safe (if returning NULL is ok), or be a BUG() so it doesn't become unsafe in release builds. ~Andrew
Le 14/11/2025 à 14:24, Andrew Cooper a écrit : > On 14/11/2025 1:05 pm, Teddy Astie wrote: >> A 4K page appears to be able to hold 128 ioreq entries, which luckly >> matches the current vCPU limit. However, if we decide to increase the >> domain vCPU limit, that doesn't hold anymore and this function would now >> silently create a out of bounds pointer leading to confusing problems. >> >> All architectures have no more than 128 as vCPU limit on HVM guests, >> and have pages that are at most 4 KB, so this case doesn't occurs in >> with the current limits. >> >> Assert if the vCPU ID will lead to a out of bounds pointer. >> >> No functional change. >> >> Reported-by: Julian Vetter <julian.vetter@vates.tech> >> Signed-off-by: Teddy Astie <teddy.astie@vates.tech> >> --- >> Not sure if this is the best approach, perhaps preventing compilation if the >> vCPU limit is higher than what the ioreq page can hold is preferable ? >> >> xen/common/ioreq.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c >> index f5fd30ce12..b2ef46ed7b 100644 >> --- a/xen/common/ioreq.c >> +++ b/xen/common/ioreq.c >> @@ -99,6 +99,7 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v) >> >> ASSERT((v == current) || !vcpu_runnable(v)); >> ASSERT(p != NULL); >> + ASSERT(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))); >> >> return &p->vcpu_ioreq[v->vcpu_id]; >> } > > The 128 vCPU problem was known and IOERQ servers intentionally had > multi-page capabilities from the outset to address this problem. > > See the calculation in ioreq_server_max_frames(). > > It hasn't been exercised in anger because there's still the APIC ID > limit at 128 still, but IOREQ servers are expected to work correctly > above this limit. Yes, although only the first ioreq page exists and can be used (even though ioreq_server_max_frames may report more). It looks like the logic is incomplete there. > That said, this function is clearly buggy and in need > of fixing. > > To the assert specifically, that's really not appropriate. If it were > to ever fire, we'd have an XSA. Logic if this nature either needs to be > fail safe (if returning NULL is ok), or be a BUG() so it doesn't become > unsafe in release builds. > It looks like it is possible to somewhat gracefully handle this, by e.g crashing the offending domain instead of stepping in a assert. I'm trying this for v2. > ~Andrew -- Teddy Astie | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech
© 2016 - 2025 Red Hat, Inc.