[Qemu-devel] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting

Peter Maydell posted 2 patches 7 years, 10 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1513183941-24300-1-git-send-email-peter.maydell@linaro.org
Test checkpatch passed
Test docker passed
Test ppc passed
Test s390x passed
hw/intc/arm_gic.c              |  5 +++--
hw/intc/arm_gicv3_dist.c       | 13 +++++++++++++
hw/intc/arm_gicv3_its_common.c |  8 +++-----
hw/intc/arm_gicv3_redist.c     | 13 +++++++++++++
4 files changed, 32 insertions(+), 7 deletions(-)
[Qemu-devel] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Peter Maydell 7 years, 10 months ago
The GICv2 and GICv3 specifications say that reserved register
addresses should RAZ/WI.  This means we need to return MEMTX_OK, not
MEMTX_ERROR, because now that we support generating external aborts
the latter will cause an abort on new board models.

In particular, at least some versions of UEFI try to
access a reserved address in the GICv3 redistributor
(at SGI_base + 0x184) and fail to boot on the virt board
without this.

thanks
-- PMM

Peter Maydell (2):
  hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI
  hw/intc/arm_gic: reserved register addresses are RAZ/WI

 hw/intc/arm_gic.c              |  5 +++--
 hw/intc/arm_gicv3_dist.c       | 13 +++++++++++++
 hw/intc/arm_gicv3_its_common.c |  8 +++-----
 hw/intc/arm_gicv3_redist.c     | 13 +++++++++++++
 4 files changed, 32 insertions(+), 7 deletions(-)

-- 
2.7.4


Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Peter Maydell 7 years, 10 months ago
On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote:
> The GICv2 and GICv3 specifications say that reserved register
> addresses should RAZ/WI.  This means we need to return MEMTX_OK, not
> MEMTX_ERROR, because now that we support generating external aborts
> the latter will cause an abort on new board models.
>
> In particular, at least some versions of UEFI try to
> access a reserved address in the GICv3 redistributor
> (at SGI_base + 0x184) and fail to boot on the virt board
> without this.
>
> thanks
> -- PMM
>
> Peter Maydell (2):
>   hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI
>   hw/intc/arm_gic: reserved register addresses are RAZ/WI

Oops, forgot the cc-stable tags:

Cc: qemu-stable@nongnu.org

thanks
-- PMM

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Ard Biesheuvel 7 years, 10 months ago
On 13 December 2017 at 16:54, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote:
>> The GICv2 and GICv3 specifications say that reserved register
>> addresses should RAZ/WI.  This means we need to return MEMTX_OK, not
>> MEMTX_ERROR, because now that we support generating external aborts
>> the latter will cause an abort on new board models.
>>
>> In particular, at least some versions of UEFI try to
>> access a reserved address in the GICv3 redistributor
>> (at SGI_base + 0x184) and fail to boot on the virt board
>> without this.
>>

Could this commit

https://github.com/tianocore/edk2/commit/28f8d28faabf50a82ef8d137308592c64ea9e2b6

have anything to do with that?

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Peter Maydell 7 years, 9 months ago
Ping for code review?

thanks
-- PMM

On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote:
> The GICv2 and GICv3 specifications say that reserved register
> addresses should RAZ/WI.  This means we need to return MEMTX_OK, not
> MEMTX_ERROR, because now that we support generating external aborts
> the latter will cause an abort on new board models.
>
> In particular, at least some versions of UEFI try to
> access a reserved address in the GICv3 redistributor
> (at SGI_base + 0x184) and fail to boot on the virt board
> without this.
>
> thanks
> -- PMM
>
> Peter Maydell (2):
>   hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI
>   hw/intc/arm_gic: reserved register addresses are RAZ/WI
>
>  hw/intc/arm_gic.c              |  5 +++--
>  hw/intc/arm_gicv3_dist.c       | 13 +++++++++++++
>  hw/intc/arm_gicv3_its_common.c |  8 +++-----
>  hw/intc/arm_gicv3_redist.c     | 13 +++++++++++++
>  4 files changed, 32 insertions(+), 7 deletions(-)

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Laszlo Ersek 7 years, 9 months ago
On 01/09/18 15:24, Peter Maydell wrote:
> Ping for code review?
> 
> thanks
> -- PMM
> 
> On 13 December 2017 at 16:52, Peter Maydell <peter.maydell@linaro.org> wrote:
>> The GICv2 and GICv3 specifications

here's where I had started running from your patch...

>> say that reserved register
>> addresses should RAZ/WI.

...and I remarked only looking over my shoulder that "RAZ/WI" is not a
verb :)

Sorry, no clue about any of this -- where should I read up?	

Ard did ask a question though:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg500055.html

CC'ing Drew, Eric and Wei. In particular (IIUC) Eric has been looking
into a weird "why do we have so many (and so slow) aborts with GICv4
emulation" issue, at launching UEFI.

Thanks
Laszlo

>> This means we need to return MEMTX_OK, not
>> MEMTX_ERROR, because now that we support generating external aborts
>> the latter will cause an abort on new board models.
>>
>> In particular, at least some versions of UEFI try to
>> access a reserved address in the GICv3 redistributor
>> (at SGI_base + 0x184) and fail to boot on the virt board
>> without this.
>>
>> thanks
>> -- PMM
>>
>> Peter Maydell (2):
>>   hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI
>>   hw/intc/arm_gic: reserved register addresses are RAZ/WI
>>
>>  hw/intc/arm_gic.c              |  5 +++--
>>  hw/intc/arm_gicv3_dist.c       | 13 +++++++++++++
>>  hw/intc/arm_gicv3_its_common.c |  8 +++-----
>>  hw/intc/arm_gicv3_redist.c     | 13 +++++++++++++
>>  4 files changed, 32 insertions(+), 7 deletions(-)


Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Peter Maydell 7 years, 9 months ago
On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
> Sorry, no clue about any of this -- where should I read up?

I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
not because I wanted to make you read the GIC specs :-)

> Ard did ask a question though:
>
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg500055.html

Sounds plausible (my UEFI binary I hit this with is pretty ancient)
but I don't know for certain. It's one of those things that seems
like it's a bug in UEFI (perhaps now fixed) but which is also
definitely a bug in QEMU, and if it is a UEFI bug it's pretty
harmless.

thanks
-- PMM

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Laszlo Ersek 7 years, 9 months ago
On 01/09/18 17:12, Peter Maydell wrote:
> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
>> Sorry, no clue about any of this -- where should I read up?
> 
> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
> not because I wanted to make you read the GIC specs :-)

Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
too, and the patch seems to be fixing commit a9d853533cc1
("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
2015-05-12).

Is that right? That commit was released with v2.4.0. Should I have
experienced the error? Is it KVM / hardware specific? What are the symptoms?

>> Ard did ask a question though:
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg500055.html
> 
> Sounds plausible (my UEFI binary I hit this with is pretty ancient)
> but I don't know for certain. It's one of those things that seems
> like it's a bug in UEFI (perhaps now fixed) but which is also
> definitely a bug in QEMU, and if it is a UEFI bug it's pretty
> harmless.

... I don't know the symptoms of the issue either that was fixed by
<https://github.com/tianocore/edk2/commit/28f8d28faabf50a82ef8d137308592c64ea9e2b6>.
Guest crashes with unhandled data abort? (I.e., impossible not to notice.)

Thanks!
Laszlo

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Peter Maydell 7 years, 9 months ago
On 9 January 2018 at 16:29, Laszlo Ersek <lersek@redhat.com> wrote:
> On 01/09/18 17:12, Peter Maydell wrote:
>> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
>>> Sorry, no clue about any of this -- where should I read up?
>>
>> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
>> not because I wanted to make you read the GIC specs :-)
>
> Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
> too, and the patch seems to be fixing commit a9d853533cc1
> ("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
> 2015-05-12).
>
> Is that right? That commit was released with v2.4.0. Should I have
> experienced the error? Is it KVM / hardware specific? What are the symptoms?

Happens under TCG emulation only. The bug got introduced with
commit c79c0a314c43b78f6, which changed the effect of a device
returning MEMTX_ERROR/MEMTX_DECODE_ERROR from "RAZ/WI" to
"guest data abort". That's in general the right thing to do,
but in the case of these device models we were returning those
values for cases which aren't supposed to provoke aborts.

The symptom is that if your guest code is poorly behaved and
tries to read or write offsets that don't correspond to documented
GIC registers, it will take an abort that it didn't before.
Linux is fine; this UEFI image I have lying around stopped working.

thanks
-- PMM

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Laszlo Ersek 7 years, 9 months ago
On 01/09/18 17:35, Peter Maydell wrote:
> On 9 January 2018 at 16:29, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 01/09/18 17:12, Peter Maydell wrote:
>>> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
>>>> Sorry, no clue about any of this -- where should I read up?
>>>
>>> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
>>> not because I wanted to make you read the GIC specs :-)
>>
>> Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
>> too, and the patch seems to be fixing commit a9d853533cc1
>> ("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
>> 2015-05-12).
>>
>> Is that right? That commit was released with v2.4.0. Should I have
>> experienced the error? Is it KVM / hardware specific? What are the symptoms?
> 
> Happens under TCG emulation only. The bug got introduced with
> commit c79c0a314c43b78f6, which changed the effect of a device
> returning MEMTX_ERROR/MEMTX_DECODE_ERROR from "RAZ/WI" to
> "guest data abort". That's in general the right thing to do,
> but in the case of these device models we were returning those
> values for cases which aren't supposed to provoke aborts.
> 
> The symptom is that if your guest code is poorly behaved and
> tries to read or write offsets that don't correspond to documented
> GIC registers, it will take an abort that it didn't before.
> Linux is fine; this UEFI image I have lying around stopped working.

Thank you for the description. Now I'm thinking the failure indeed needs
combining both bugs (UEFI -- presumed --, and QEMU). I've been using
qemu-system-aarch64, version 2.11, on my x86_64 laptop, for a while,
with no problems at all -- it's a registered RHEL-ALT-7.4 guest that I
last booted three days ago. And, v2.11.0 has c79c0a314c43b78f6. On the
other hand, I tend to run bleeding-edge ArmVirtQemu firmware.

... I see that your patches restore the GIC behavior to (sort-of)
pre-a9d853533cc1 state, thereby de-fanging the over-generic
c79c0a314c43b78f6. (Does this summary make sense?) If you think an ACK
from me isn't worthless for reflecting this "realization" (lol), I can
provide it.

Thanks!
Laszlo

Re: [Qemu-devel] [Qemu-arm] [PATCH 0/2] GICv2 & GICv3: RAZ/WI reserved addresses rather than aborting
Posted by Andrew Jones 7 years, 9 months ago
On Tue, Jan 09, 2018 at 04:35:30PM +0000, Peter Maydell wrote:
> On 9 January 2018 at 16:29, Laszlo Ersek <lersek@redhat.com> wrote:
> > On 01/09/18 17:12, Peter Maydell wrote:
> >> On 9 January 2018 at 15:58, Laszlo Ersek <lersek@redhat.com> wrote:
> >>> Sorry, no clue about any of this -- where should I read up?
> >>
> >> I cc'd you mostly as a heads-up since the QEMU bug is UEFI affecting,
> >> not because I wanted to make you read the GIC specs :-)
> >
> > Thanks (and, thanks :) ) -- from patch #2, looks like GICv2 is affected
> > too, and the patch seems to be fixing commit a9d853533cc1
> > ("hw/intc/arm_gic: Switch to read/write callbacks with tx attributes",
> > 2015-05-12).
> >
> > Is that right? That commit was released with v2.4.0. Should I have
> > experienced the error? Is it KVM / hardware specific? What are the symptoms?
> 
> Happens under TCG emulation only. The bug got introduced with
> commit c79c0a314c43b78f6, which changed the effect of a device
> returning MEMTX_ERROR/MEMTX_DECODE_ERROR from "RAZ/WI" to
> "guest data abort". That's in general the right thing to do,
> but in the case of these device models we were returning those
> values for cases which aren't supposed to provoke aborts.
> 
> The symptom is that if your guest code is poorly behaved and
> tries to read or write offsets that don't correspond to documented
> GIC registers, it will take an abort that it didn't before.
> Linux is fine; this UEFI image I have lying around stopped working.

Thanks for pointing this out. I had just recently noticed that the
'gicv3-active' kvm-unit-tests test had stopped passing on TCG, getting
DABTs when attempting to write GICD_ICACTIVER. I was just about to
complain in this thread that that register is indeed documented, but
now I see the implementation of the test was always wrong. It was using
the wrong register base, RD vs. SGI...

drew