The GICv2 and GICv3 specifications say that reserved register addresses should RAZ/WI. This means we need to return MEMTX_OK, not MEMTX_ERROR, because now that we support generating external aborts the latter will cause an abort on new board models. In particular, at least some versions of UEFI try to access a reserved address in the GICv3 redistributor (at SGI_base + 0x184) and fail to boot on the virt board without this. thanks -- PMM Peter Maydell (2): hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI hw/intc/arm_gic: reserved register addresses are RAZ/WI hw/intc/arm_gic.c | 5 +++-- hw/intc/arm_gicv3_dist.c | 13 +++++++++++++ hw/intc/arm_gicv3_its_common.c | 8 +++----- hw/intc/arm_gicv3_redist.c | 13 +++++++++++++ 4 files changed, 32 insertions(+), 7 deletions(-) -- 2.7.4
On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote: > The GICv2 and GICv3 specifications say that reserved register > addresses should RAZ/WI. This means we need to return MEMTX_OK, not > MEMTX_ERROR, because now that we support generating external aborts > the latter will cause an abort on new board models. > > In particular, at least some versions of UEFI try to > access a reserved address in the GICv3 redistributor > (at SGI_base + 0x184) and fail to boot on the virt board > without this. > > thanks > -- PMM > > Peter Maydell (2): > hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI > hw/intc/arm_gic: reserved register addresses are RAZ/WI Oops, forgot the cc-stable tags: Cc: qemu-stable@nongnu.org thanks -- PMM
On 13 December 2017 at 16:54, Peter Maydell <peter.maydell@linaro.org> wrote: > On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote: >> The GICv2 and GICv3 specifications say that reserved register >> addresses should RAZ/WI. This means we need to return MEMTX_OK, not >> MEMTX_ERROR, because now that we support generating external aborts >> the latter will cause an abort on new board models. >> >> In particular, at least some versions of UEFI try to >> access a reserved address in the GICv3 redistributor >> (at SGI_base + 0x184) and fail to boot on the virt board >> without this. >> Could this commit https://github.com/tianocore/edk2/commit/28f8d28faabf50a82ef8d137308592c64ea9e2b6 have anything to do with that?
Ping for code review? thanks -- PMM On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote: > The GICv2 and GICv3 specifications say that reserved register > addresses should RAZ/WI. This means we need to return MEMTX_OK, not > MEMTX_ERROR, because now that we support generating external aborts > the latter will cause an abort on new board models. > > In particular, at least some versions of UEFI try to > access a reserved address in the GICv3 redistributor > (at SGI_base + 0x184) and fail to boot on the virt board > without this. > > thanks > -- PMM > > Peter Maydell (2): > hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI > hw/intc/arm_gic: reserved register addresses are RAZ/WI > > hw/intc/arm_gic.c | 5 +++-- > hw/intc/arm_gicv3_dist.c | 13 +++++++++++++ > hw/intc/arm_gicv3_its_common.c | 8 +++----- > hw/intc/arm_gicv3_redist.c | 13 +++++++++++++ > 4 files changed, 32 insertions(+), 7 deletions(-)
On 01/09/18 15:24, Peter Maydell wrote: > Ping for code review? > > thanks > -- PMM > > On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote: >> The GICv2 and GICv3 specifications here's where I had started running from your patch... >> say that reserved register >> addresses should RAZ/WI. ...and I remarked only looking over my shoulder that "RAZ/WI" is not a verb :) Sorry, no clue about any of this -- where should I read up? Ard did ask a question though: https://www.mail-archive.com/qemu-devel@nongnu.org/msg500055.html CC'ing Drew, Eric and Wei. In particular (IIUC) Eric has been looking into a weird "why do we have so many (and so slow) aborts with GICv4 emulation" issue, at launching UEFI. Thanks Laszlo >> This means we need to return MEMTX_OK, not >> MEMTX_ERROR, because now that we support generating external aborts >> the latter will cause an abort on new board models. >> >> In particular, at least some versions of UEFI try to >> access a reserved address in the GICv3 redistributor >> (at SGI_base + 0x184) and fail to boot on the virt board >> without this. >> >> thanks >> -- PMM >> >> Peter Maydell (2): >> hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI >> hw/intc/arm_gic: reserved register addresses are RAZ/WI >> >> hw/intc/arm_gic.c | 5 +++-- >> hw/intc/arm_gicv3_dist.c | 13 +++++++++++++ >> hw/intc/arm_gicv3_its_common.c | 8 +++----- >> hw/intc/arm_gicv3_redist.c | 13 +++++++++++++ >> 4 files changed, 32 insertions(+), 7 deletions(-)
On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote: > Sorry, no clue about any of this -- where should I read up? I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting, not because I wanted to make you read the GIC specs :-) > Ard did ask a question though: > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg500055.html Sounds plausible (my UEFI binary I hit this with is pretty ancient) but I don't know for certain. It's one of those things that seems like it's a bug in UEFI (perhaps now fixed) but which is also definitely a bug in QEMU, and if it is a UEFI bug it's pretty harmless. thanks -- PMM
On 01/09/18 17:12, Peter Maydell wrote:
> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
>> Sorry, no clue about any of this -- where should I read up?
>
> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
> not because I wanted to make you read the GIC specs :-)
Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
too, and the patch seems to be fixing commit a9d853533cc1
("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
2015-05-12).
Is that right? That commit was released with v2.4.0. Should I have
experienced the error? Is it KVM / hardware specific? What are the symptoms?
>> Ard did ask a question though:
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg500055.html
>
> Sounds plausible (my UEFI binary I hit this with is pretty ancient)
> but I don't know for certain. It's one of those things that seems
> like it's a bug in UEFI (perhaps now fixed) but which is also
> definitely a bug in QEMU, and if it is a UEFI bug it's pretty
> harmless.
... I don't know the symptoms of the issue either that was fixed by
<https://github.com/tianocore/edk2/commit/28f8d28faabf50a82ef8d137308592c64ea9e2b6>.
Guest crashes with unhandled data abort? (I.e., impossible not to notice.)
Thanks!
Laszlo
On 9 January 2018 at 16:29, Laszlo Ersek <lersek@redhat.com> wrote:
> On 01/09/18 17:12, Peter Maydell wrote:
>> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
>>> Sorry, no clue about any of this -- where should I read up?
>>
>> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
>> not because I wanted to make you read the GIC specs :-)
>
> Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
> too, and the patch seems to be fixing commit a9d853533cc1
> ("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
> 2015-05-12).
>
> Is that right? That commit was released with v2.4.0. Should I have
> experienced the error? Is it KVM / hardware specific? What are the symptoms?
Happens under TCG emulation only. The bug got introduced with
commit c79c0a314c43b78f6, which changed the effect of a device
returning MEMTX_ERROR/MEMTX_DECODE_ERROR from "RAZ/WI" to
"guest data abort". That's in general the right thing to do,
but in the case of these device models we were returning those
values for cases which aren't supposed to provoke aborts.
The symptom is that if your guest code is poorly behaved and
tries to read or write offsets that don't correspond to documented
GIC registers, it will take an abort that it didn't before.
Linux is fine; this UEFI image I have lying around stopped working.
thanks
-- PMM
On 01/09/18 17:35, Peter Maydell wrote:
> On 9 January 2018 at 16:29, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 01/09/18 17:12, Peter Maydell wrote:
>>> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
>>>> Sorry, no clue about any of this -- where should I read up?
>>>
>>> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
>>> not because I wanted to make you read the GIC specs :-)
>>
>> Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
>> too, and the patch seems to be fixing commit a9d853533cc1
>> ("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
>> 2015-05-12).
>>
>> Is that right? That commit was released with v2.4.0. Should I have
>> experienced the error? Is it KVM / hardware specific? What are the symptoms?
>
> Happens under TCG emulation only. The bug got introduced with
> commit c79c0a314c43b78f6, which changed the effect of a device
> returning MEMTX_ERROR/MEMTX_DECODE_ERROR from "RAZ/WI" to
> "guest data abort". That's in general the right thing to do,
> but in the case of these device models we were returning those
> values for cases which aren't supposed to provoke aborts.
>
> The symptom is that if your guest code is poorly behaved and
> tries to read or write offsets that don't correspond to documented
> GIC registers, it will take an abort that it didn't before.
> Linux is fine; this UEFI image I have lying around stopped working.
Thank you for the description. Now I'm thinking the failure indeed needs
combining both bugs (UEFI -- presumed --, and QEMU). I've been using
qemu-system-aarch64, version 2.11, on my x86_64 laptop, for a while,
with no problems at all -- it's a registered RHEL-ALT-7.4 guest that I
last booted three days ago. And, v2.11.0 has c79c0a314c43b78f6. On the
other hand, I tend to run bleeding-edge ArmVirtQemu firmware.
... I see that your patches restore the GIC behavior to (sort-of)
pre-a9d853533cc1 state, thereby de-fanging the over-generic
c79c0a314c43b78f6. (Does this summary make sense?) If you think an ACK
from me isn't worthless for reflecting this "realization" (lol), I can
provide it.
Thanks!
Laszlo
On Tue, Jan 09, 2018 at 04:35:30PM +0000, Peter Maydell wrote:
> On 9 January 2018 at 16:29, Laszlo Ersek <lersek@redhat.com> wrote:
> > On 01/09/18 17:12, Peter Maydell wrote:
> >> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
> >>> Sorry, no clue about any of this -- where should I read up?
> >>
> >> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
> >> not because I wanted to make you read the GIC specs :-)
> >
> > Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
> > too, and the patch seems to be fixing commit a9d853533cc1
> > ("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
> > 2015-05-12).
> >
> > Is that right? That commit was released with v2.4.0. Should I have
> > experienced the error? Is it KVM / hardware specific? What are the symptoms?
>
> Happens under TCG emulation only. The bug got introduced with
> commit c79c0a314c43b78f6, which changed the effect of a device
> returning MEMTX_ERROR/MEMTX_DECODE_ERROR from "RAZ/WI" to
> "guest data abort". That's in general the right thing to do,
> but in the case of these device models we were returning those
> values for cases which aren't supposed to provoke aborts.
>
> The symptom is that if your guest code is poorly behaved and
> tries to read or write offsets that don't correspond to documented
> GIC registers, it will take an abort that it didn't before.
> Linux is fine; this UEFI image I have lying around stopped working.
Thanks for pointing this out. I had just recently noticed that the
'gicv3-active' kvm-unit-tests test had stopped passing on TCG, getting
DABTs when attempting to write GICD_ICACTIVER. I was just about to
complain in this thread that that register is indeed documented, but
now I see the implementation of the test was always wrong. It was using
the wrong register base, RD vs. SGI...
drew
© 2016 - 2025 Red Hat, Inc.