[PATCH 0/4] target/ppc: Catch invalid real address accesses

Nicholas Piggin posted 4 patches 10 months, 4 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230623081953.290875-1-npiggin@gmail.com
Maintainers: Daniel Henrique Barboza <danielhb413@gmail.com>, "Cédric Le Goater" <clg@kaod.org>, David Gibson <david@gibson.dropbear.id.au>, Greg Kurz <groug@kaod.org>
There is a newer version of this series
target/ppc/cpu_init.c    |   1 +
target/ppc/excp_helper.c | 178 +++++++++++++++++++--------------------
target/ppc/internal.h    |   5 ++
3 files changed, 95 insertions(+), 89 deletions(-)
[PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Nicholas Piggin 10 months, 4 weeks ago
ppc has always silently ignored access to real (physical) addresses
with nothing behind it, which can make debugging difficult at times.

It looks like the way to handle this is implement the transaction
failed call, which most target architectures do. Notably not x86
though, I wonder why?

Other question is, sometimes I guess it's nice to avoid crashing in
order to try to quickly get past some unimplemented MMIO. Maybe a
command line option or something could turn it off? It should
probably be a QEMU-wide option if so, so that shouldn't hold this
series up, I can propose a option for that if anybody is worried
about it.

In any case this seems to work. I have only implemented it for books
64-bit for now because I'm not sure what all the others do. I could
try to investigate it if there is interest.

I took Zoltan's patch and tweaked it slightly (hopefully that's okay).

Also made a better checkstop while we're here because the old one
really doesn't work.

Thanks,
Nick

BALATON Zoltan (1):
  target/ppc: Move common check in machne check handlers to a function

Nicholas Piggin (3):
  target/ppc: Machine check on invalid real address access
  target/ppc: Add POWER9/10 invalid-real machine check codes
  target/ppc: Make checkstop stop the system

 target/ppc/cpu_init.c    |   1 +
 target/ppc/excp_helper.c | 178 +++++++++++++++++++--------------------
 target/ppc/internal.h    |   5 ++
 3 files changed, 95 insertions(+), 89 deletions(-)

-- 
2.40.1
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Peter Maydell 10 months, 4 weeks ago
On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
>
> ppc has always silently ignored access to real (physical) addresses
> with nothing behind it, which can make debugging difficult at times.
>
> It looks like the way to handle this is implement the transaction
> failed call, which most target architectures do. Notably not x86
> though, I wonder why?

Much of this is historical legacy. QEMU originally had no
concept of "the system outside the CPU returns some kind
of bus error and the CPU raises an exception for it".
This is turn is (I think) because the x86 PC doesn't do
that: you always get back some kind of response, I think
-1 on reads and writes ignored. We added the do_transaction_failed
hook largely because we wanted it to give more accurate
emulation of this kind of thing on Arm, but as usual with new
facilities we left the other architectures to do it themselves
if they wanted -- by default the behaviour remained the same.
Some architectures have picked it up; some haven't.

The main reason it's a bit of a pain to turn the correct
handling on is because often boards don't actually implement
all the devices they're supposed to. For a pile of legacy Arm
boards, especially where we didn't have good test images,
we use the machine flag ignore_memory_transaction_failures to
retain the legacy behaviour. (This isn't great because it's
pretty much going to mean we have that flag set on those
boards forever because nobody is going to care enough to
investigate and test.)

> Other question is, sometimes I guess it's nice to avoid crashing in
> order to try to quickly get past some unimplemented MMIO. Maybe a
> command line option or something could turn it off? It should
> probably be a QEMU-wide option if so, so that shouldn't hold this
> series up, I can propose a option for that if anybody is worried
> about it.

I would not recommend going any further than maybe setting the
ignore_memory_transaction_failures flag for boards you don't
care about. (But in an ideal world, don't set it and deal with
any bug reports by implementing stub versions of missing devices.
Depends how confident you are in your test coverage.)

thanks
-- PMM
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Nicholas Piggin 10 months, 3 weeks ago
On Fri Jun 23, 2023 at 7:10 PM AEST, Peter Maydell wrote:
> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > ppc has always silently ignored access to real (physical) addresses
> > with nothing behind it, which can make debugging difficult at times.
> >
> > It looks like the way to handle this is implement the transaction
> > failed call, which most target architectures do. Notably not x86
> > though, I wonder why?
>
> Much of this is historical legacy. QEMU originally had no
> concept of "the system outside the CPU returns some kind
> of bus error and the CPU raises an exception for it".
> This is turn is (I think) because the x86 PC doesn't do
> that: you always get back some kind of response, I think
> -1 on reads and writes ignored. We added the do_transaction_failed
> hook largely because we wanted it to give more accurate
> emulation of this kind of thing on Arm, but as usual with new
> facilities we left the other architectures to do it themselves
> if they wanted -- by default the behaviour remained the same.
> Some architectures have picked it up; some haven't.
>
> The main reason it's a bit of a pain to turn the correct
> handling on is because often boards don't actually implement
> all the devices they're supposed to. For a pile of legacy Arm
> boards, especially where we didn't have good test images,
> we use the machine flag ignore_memory_transaction_failures to
> retain the legacy behaviour. (This isn't great because it's
> pretty much going to mean we have that flag set on those
> boards forever because nobody is going to care enough to
> investigate and test.)
>
> > Other question is, sometimes I guess it's nice to avoid crashing in
> > order to try to quickly get past some unimplemented MMIO. Maybe a
> > command line option or something could turn it off? It should
> > probably be a QEMU-wide option if so, so that shouldn't hold this
> > series up, I can propose a option for that if anybody is worried
> > about it.
>
> I would not recommend going any further than maybe setting the
> ignore_memory_transaction_failures flag for boards you don't
> care about. (But in an ideal world, don't set it and deal with
> any bug reports by implementing stub versions of missing devices.
> Depends how confident you are in your test coverage.)

Thanks for the background, interesting and helpful. So I think
it is the right place for powerpc BookS 64 to hook into. Point
taken about adding a global option for it. Will try to fix the
known problems first, maybe it won't be too hard.

Thanks,
Nick
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 4 weeks ago
On 6/23/23 11:10, Peter Maydell wrote:
> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
>>
>> ppc has always silently ignored access to real (physical) addresses
>> with nothing behind it, which can make debugging difficult at times.
>>
>> It looks like the way to handle this is implement the transaction
>> failed call, which most target architectures do. Notably not x86
>> though, I wonder why?
> 
> Much of this is historical legacy. QEMU originally had no
> concept of "the system outside the CPU returns some kind
> of bus error and the CPU raises an exception for it".
> This is turn is (I think) because the x86 PC doesn't do
> that: you always get back some kind of response, I think
> -1 on reads and writes ignored. We added the do_transaction_failed
> hook largely because we wanted it to give more accurate
> emulation of this kind of thing on Arm, but as usual with new
> facilities we left the other architectures to do it themselves
> if they wanted -- by default the behaviour remained the same.
> Some architectures have picked it up; some haven't.
> 
> The main reason it's a bit of a pain to turn the correct
> handling on is because often boards don't actually implement
> all the devices they're supposed to. For a pile of legacy Arm
> boards, especially where we didn't have good test images,
> we use the machine flag ignore_memory_transaction_failures to
> retain the legacy behaviour. (This isn't great because it's
> pretty much going to mean we have that flag set on those
> boards forever because nobody is going to care enough to
> investigate and test.)
> 
>> Other question is, sometimes I guess it's nice to avoid crashing in
>> order to try to quickly get past some unimplemented MMIO. Maybe a
>> command line option or something could turn it off? It should
>> probably be a QEMU-wide option if so, so that shouldn't hold this
>> series up, I can propose a option for that if anybody is worried
>> about it.
> 
> I would not recommend going any further than maybe setting the
> ignore_memory_transaction_failures flag for boards you don't
> care about. (But in an ideal world, don't set it and deal with
> any bug reports by implementing stub versions of missing devices.
> Depends how confident you are in your test coverage.)

It seems it broke the "mac99" and  powernv10 machines, using the
qemu-ppc-boot images which are mostly buildroot. See below for logs.

Adding Mark for further testing on Mac OS.

Thanks,

C.


Saving 256 bits of non-creditable seed for next boot
Starting network: [    3.876834] Disabling lock debugging due to kernel taint
[    3.877351] Oops: Machine check, sig: 7 [#1]
[    3.877697] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=4 PowerMac
[    3.878019] Modules linked in:
[    3.878233] CPU: 0 PID: 93 Comm: ip Tainted: G   M               6.1.14 #1
[    3.878403] Hardware name: PowerMac3,1 PPC970 0x390202 PowerMac
[    3.878554] NIP:  c0000000007f1c28 LR: c0000000007f18d0 CTR: 0000000000000000
[    3.878721] REGS: c00000000ffefd60 TRAP: 0200   Tainted: G   M                (6.1.14)
[    3.878922] MSR:  8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 44002884  XER: 00000000
[    3.879149] DAR: c0003e0080103010 DSISR: 42000000 IRQMASK: 0
[    3.879149] GPR00: c0000000007f10fc c0000000037f35e0 c000000000d4bc00 0000000000000024
[    3.879149] GPR04: c0003f00000cbdc8 0000000000000000 0000000000000000 c0000000037f35a6
[    3.879149] GPR08: 0000000000000000 c0003e0080103010 0000000000000000 c000000001513c58
[    3.879149] GPR12: 0000000084002884 c0000000015fa000 000000010d8b0460 000000011a042f13
[    3.879149] GPR16: 000000011a042f7f 000000011a042f87 000000011a04052c 000000011a042f0e
[    3.879149] GPR20: 0000000000000000 0000000000000000 0000000000000000 ffffffffffffffff
[    3.879149] GPR24: c0000000036f8e00 0000000000000000 c00000000319af08 c00000000319af10
[    3.879149] GPR28: c000000003797000 c000000003798000 c00000000319a880 c000000003924c00
[    3.880889] NIP [c0000000007f1c28] .gem_reinit_chip+0xc28/0xc70
[    3.881045] LR [c0000000007f18d0] .gem_reinit_chip+0x8d0/0xc70
[    3.881209] Call Trace:
[    3.881274] [c0000000037f35e0] [c0000000007f10fc] .gem_reinit_chip+0xfc/0xc70 (unreliable)
[    3.881484] [c0000000037f3680] [c0000000007f2df4] .gem_do_start+0x34/0x430
[    3.881658] [c0000000037f3730] [c0000000009396d0] .__dev_open+0x150/0x240
[    3.881835] [c0000000037f37e0] [c000000000939d1c] .__dev_change_flags+0x20c/0x2c0
[    3.882015] [c0000000037f38b0] [c000000000939dfc] .dev_change_flags+0x2c/0x90
[    3.882185] [c0000000037f3940] [c000000000a1dc88] .devinet_ioctl+0x2e8/0x9b0
[    3.882361] [c0000000037f3a20] [c000000000a20854] .inet_ioctl+0x184/0x2c0
[    3.882527] [c0000000037f3b30] [c0000000008fa5d8] .sock_do_ioctl+0x58/0x130
[    3.882693] [c0000000037f3c00] [c0000000008fb77c] .sock_ioctl+0x1cc/0x830
[    3.882876] [c0000000037f3ce0] [c0000000002c6bcc] .__se_sys_ioctl+0x10c/0x130
[    3.883066] [c0000000037f3d80] [c0000000000252e8] .system_call_exception+0x158/0x2d0
[    3.883255] [c0000000037f3e10] [c00000000000b2d4] system_call_common+0xf4/0x258
[    3.883436] --- interrupt: c00 at 0x3fff825da858
[    3.883553] NIP:  00003fff825da858 LR: 0000000119f7b3a0 CTR: 0000000000000000
[    3.883721] REGS: c0000000037f3e80 TRAP: 0c00   Tainted: G   M                (6.1.14)
[    3.883910] MSR:  800000000200f032 <SF,VEC,EE,PR,FP,ME,IR,DR,RI>  CR: 28002884  XER: 00000000
[    3.884149] IRQMASK: 0
[    3.884149] GPR00: 0000000000000036 00003fffc530af10 00003fff826b7400 0000000000000003
[    3.884149] GPR04: 0000000000008914 00003fffc530b0a0 000000011a040518 0000000000000000
[    3.884149] GPR08: 00000000803c7414 0000000000000000 0000000000000000 0000000000000000
[    3.884149] GPR12: 0000000000000000 00003fff827377e0 000000010d8b0460 000000011a042f13
[    3.884149] GPR16: 000000011a042f7f 000000011a042f87 000000011a04052c 000000011a042f0e
[    3.884149] GPR20: 0000000000000000 0000000000000000 0000000000000000 ffffffffffffffff
[    3.884149] GPR24: ffffffffffffffff ffffffffffffffff ffffffffffffffff 00003fffc530bf51
[    3.884149] GPR28: 00003fffc530b0a0 0000000000000001 0000000000000001 000000011a040518
[    3.885775] NIP [00003fff825da858] 0x3fff825da858
[    3.885902] LR [0000000119f7b3a0] 0x119f7b3a0
[    3.886006] --- interrupt: c00
[    3.886109] Instruction dump:
[    3.886296] 39290044 55290032 1d09fffe 7d485214 7d295050 915e01b8 913e01bc 4bfff628
[    3.886537] e93e0000 39400000 39293010 7c0004ac <7d404d2c> a14d02f8 5548043e 2c080000
[    3.886968] ---[ end trace 0000000000000000 ]---
[    3.887112]

and

[   34.443113577,5] XIVE: [ IC 00  ] Resetting one xive...
[   34.446516646,5] RESET: Fast reboot disabled: Kernel re-entered OPAL
[   34.447320802,3] OPAL exiting with locks held, pir=0000 token=80 retval=1
[   34.447483518,3]   hw/xive2.c:3955
[   34.447578862,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=0000 cpu @0x31c10000 -> pir=0000 token=158
[   34.447781018,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=0000 cpu @0x31c10000 -> pir=0000 token=158
[    0.000000][    C0] Disabling lock debugging due to kernel taint
[    0.000000][    C0] MCE: CPU0: machine check (Severe)  Real address Load (bad) DAR: 0006030201000038 [Not recovered]
[    0.000000][    C0] MCE: CPU0: NIP: [0000000030093290] 0x30093290
[    0.000000][    C0] MCE: CPU0: Initiator CPU
[    0.000000][    C0] MCE: CPU0: Hardware error
[    0.000000][    C0] opal: Hardware platform error: Unrecoverable Machine Check exception
[    0.000000][    C0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   M               6.1.14 #1
[    0.000000][    C0] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.0 PowerNV
[    0.000000][    C0] NIP:  0000000030093290 LR: 0000000030093284 CTR: 0000000030100f20
[    0.000000][    C0] REGS: c00000003df8fd60 TRAP: 0200   Tainted: G   M                (6.1.14)
[    0.000000][    C0] MSR:  9000000002201002 <SF,HV,VEC,ME,RI>  CR: 84022424  XER: 20040000
[    0.000000][    C0] CFAR: 0000000030112dd8 DAR: 0006030201000038 DSISR: 00000040 IRQMASK: 3
[    0.000000][    C0] GPR00: 0000000030093284 0000000031c13bd0 0000000030192600 8000000000000000
[    0.000000][    C0] GPR04: 0000000000030000 0000000000000300 0000000030144f46 0000000030144f38
[    0.000000][    C0] GPR08: 0000000000000010 0006030201000038 0000000000000000 0000000000000001
[    0.000000][    C0] GPR12: 0000000024022424 c000000002580000 0000000000000000 0000000000000000
[    0.000000][    C0] GPR16: 0000000031c10000 0000000000000001 0000000000000008 0000000000800000
[    0.000000][    C0] GPR20: ffffffffffffff80 0000000030608ea8 0000000030145427 00000000301453e8
[    0.000000][    C0] GPR24: 0000000030145312 000000003014548f 00000000301454bd 0000000030608118
[    0.000000][    C0] GPR28: 0000000000000038 0006030201000000 0000000000000003 0000000030608e08
[    0.000000][    C0] NIP [0000000030093290] 0x30093290
[    0.000000][    C0] LR [0000000030093284] 0x30093284
[    0.000000][    C0] Call Trace:
[    0.000000][    C0] Instruction dump:
[    0.000000][    C0] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[    0.000000][    C0] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[   34.458079299,0] OPAL: Reboot requested due to Platform error.
[   34.458349474,3] OPAL: Reboot requested due to Platform error.[   34.458589592,5] Unable to log error
[   34.458756741,2] NVRAM: Failed to load
[    0.000000][    C0] opal: Reboot type 1 not supported for Unrecoverable Machine Check exception
[    0.000000][    C0] Kernel panic legoater@virtlab1024:~/qemu/qemu-ppc-boot.git$
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
On 6/23/23 14:37, Cédric Le Goater wrote:
> On 6/23/23 11:10, Peter Maydell wrote:
>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
>>>
>>> ppc has always silently ignored access to real (physical) addresses
>>> with nothing behind it, which can make debugging difficult at times.
>>>
>>> It looks like the way to handle this is implement the transaction
>>> failed call, which most target architectures do. Notably not x86
>>> though, I wonder why?
>>
>> Much of this is historical legacy. QEMU originally had no
>> concept of "the system outside the CPU returns some kind
>> of bus error and the CPU raises an exception for it".
>> This is turn is (I think) because the x86 PC doesn't do
>> that: you always get back some kind of response, I think
>> -1 on reads and writes ignored. We added the do_transaction_failed
>> hook largely because we wanted it to give more accurate
>> emulation of this kind of thing on Arm, but as usual with new
>> facilities we left the other architectures to do it themselves
>> if they wanted -- by default the behaviour remained the same.
>> Some architectures have picked it up; some haven't.
>>
>> The main reason it's a bit of a pain to turn the correct
>> handling on is because often boards don't actually implement
>> all the devices they're supposed to. For a pile of legacy Arm
>> boards, especially where we didn't have good test images,
>> we use the machine flag ignore_memory_transaction_failures to
>> retain the legacy behaviour. (This isn't great because it's
>> pretty much going to mean we have that flag set on those
>> boards forever because nobody is going to care enough to
>> investigate and test.)
>>
>>> Other question is, sometimes I guess it's nice to avoid crashing in
>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>>> command line option or something could turn it off? It should
>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>>> series up, I can propose a option for that if anybody is worried
>>> about it.
>>
>> I would not recommend going any further than maybe setting the
>> ignore_memory_transaction_failures flag for boards you don't
>> care about. (But in an ideal world, don't set it and deal with
>> any bug reports by implementing stub versions of missing devices.
>> Depends how confident you are in your test coverage.)
> 
> It seems it broke the "mac99" and  powernv10 machines, using the
> qemu-ppc-boot images which are mostly buildroot. See below for logs.
> 
> Adding Mark for further testing on Mac OS.
  

Mac OS 9.2 fails to boot with a popup saying :
        
         Sorry, a system error occured.
         "Sound Manager"
           address error
         To temporarily turn off extensions, restart and
         hold down the shift key


Darwin and Mac OSX look OK.


Thanks,

C.



Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Mark Cave-Ayland 10 months, 3 weeks ago
On 26/06/2023 14:35, Cédric Le Goater wrote:

> On 6/23/23 14:37, Cédric Le Goater wrote:
>> On 6/23/23 11:10, Peter Maydell wrote:
>>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
>>>>
>>>> ppc has always silently ignored access to real (physical) addresses
>>>> with nothing behind it, which can make debugging difficult at times.
>>>>
>>>> It looks like the way to handle this is implement the transaction
>>>> failed call, which most target architectures do. Notably not x86
>>>> though, I wonder why?
>>>
>>> Much of this is historical legacy. QEMU originally had no
>>> concept of "the system outside the CPU returns some kind
>>> of bus error and the CPU raises an exception for it".
>>> This is turn is (I think) because the x86 PC doesn't do
>>> that: you always get back some kind of response, I think
>>> -1 on reads and writes ignored. We added the do_transaction_failed
>>> hook largely because we wanted it to give more accurate
>>> emulation of this kind of thing on Arm, but as usual with new
>>> facilities we left the other architectures to do it themselves
>>> if they wanted -- by default the behaviour remained the same.
>>> Some architectures have picked it up; some haven't.
>>>
>>> The main reason it's a bit of a pain to turn the correct
>>> handling on is because often boards don't actually implement
>>> all the devices they're supposed to. For a pile of legacy Arm
>>> boards, especially where we didn't have good test images,
>>> we use the machine flag ignore_memory_transaction_failures to
>>> retain the legacy behaviour. (This isn't great because it's
>>> pretty much going to mean we have that flag set on those
>>> boards forever because nobody is going to care enough to
>>> investigate and test.)
>>>
>>>> Other question is, sometimes I guess it's nice to avoid crashing in
>>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>>>> command line option or something could turn it off? It should
>>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>>>> series up, I can propose a option for that if anybody is worried
>>>> about it.
>>>
>>> I would not recommend going any further than maybe setting the
>>> ignore_memory_transaction_failures flag for boards you don't
>>> care about. (But in an ideal world, don't set it and deal with
>>> any bug reports by implementing stub versions of missing devices.
>>> Depends how confident you are in your test coverage.)
>>
>> It seems it broke the "mac99" and  powernv10 machines, using the
>> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>>
>> Adding Mark for further testing on Mac OS.
> 
> 
> Mac OS 9.2 fails to boot with a popup saying :
>          Sorry, a system error occured.
>          "Sound Manager"
>            address error
>          To temporarily turn off extensions, restart and
>          hold down the shift key
> 
> 
> Darwin and Mac OSX look OK.

My guess would be that MacOS 9.2 is trying to access the sound chip registers which 
isn't implemented in QEMU for the moment (I have a separate screamer branch 
available, but it's not ready for primetime yet). In theory they shouldn't be 
accessed at all because the sound device isn't present in the OpenBIOS device tree, 
but this is all fairly old stuff.

Does implementing the sound registers using a dummy device help at all?


diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c
index 265c0bbd8d..e55f938da7 100644
--- a/hw/misc/macio/macio.c
+++ b/hw/misc/macio/macio.c
@@ -26,6 +26,7 @@
  #include "qemu/osdep.h"
  #include "qapi/error.h"
  #include "qemu/module.h"
+#include "hw/misc/unimp.h"
  #include "hw/misc/macio/cuda.h"
  #include "hw/pci/pci.h"
  #include "hw/ppc/mac_dbdma.h"
@@ -94,6 +95,7 @@ static bool macio_common_realize(PCIDevice *d, Error **errp)
  {
      MacIOState *s = MACIO(d);
      SysBusDevice *sbd;
+    DeviceState *dev;

      if (!qdev_realize(DEVICE(&s->dbdma), BUS(&s->macio_bus), errp)) {
          return false;
@@ -102,6 +104,14 @@ static bool macio_common_realize(PCIDevice *d, Error **errp)
      memory_region_add_subregion(&s->bar, 0x08000,
                                  sysbus_mmio_get_region(sbd, 0));

+    dev = qdev_new(TYPE_UNIMPLEMENTED_DEVICE);
+    qdev_prop_set_string(dev, "name", "screamer");
+    qdev_prop_set_uint64(dev, "size", 0x1000);
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+    sbd = SYS_BUS_DEVICE(dev);
+    memory_region_add_subregion(&s->bar, 0x14000,
+                                sysbus_mmio_get_region(sbd, 0));
+
      qdev_prop_set_uint32(DEVICE(&s->escc), "disabled", 0);
      qdev_prop_set_uint32(DEVICE(&s->escc), "frequency", ESCC_CLOCK);
      qdev_prop_set_uint32(DEVICE(&s->escc), "it_shift", 4);
diff --git a/include/hw/misc/macio/macio.h b/include/hw/misc/macio/macio.h
index 86df2c2b60..1894178a68 100644
--- a/include/hw/misc/macio/macio.h
+++ b/include/hw/misc/macio/macio.h
@@ -109,6 +109,7 @@ struct MacIOState {
      PMUState pmu;
      DBDMAState dbdma;
      ESCCState escc;
+    MemoryRegion screamer;
      uint64_t frequency;
  };



ATB,

Mark.

Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
>> Mac OS 9.2 fails to boot with a popup saying :
>>          Sorry, a system error occured.
>>          "Sound Manager"
>>            address error
>>          To temporarily turn off extensions, restart and
>>          hold down the shift key
>>
>>
>> Darwin and Mac OSX look OK.
> 
> My guess would be that MacOS 9.2 is trying to access the sound chip registers which isn't implemented in QEMU for the moment (I have a separate screamer branch available, but it's not ready for primetime yet). In theory they shouldn't be accessed at all because the sound device isn't present in the OpenBIOS device tree, but this is all fairly old stuff.
> 
> Does implementing the sound registers using a dummy device help at all?

Nope. OS 9 loops earlier (little black/white disk spinning).

Thanks,

C.



  
> 
> 
> diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c
> index 265c0bbd8d..e55f938da7 100644
> --- a/hw/misc/macio/macio.c
> +++ b/hw/misc/macio/macio.c
> @@ -26,6 +26,7 @@
>   #include "qemu/osdep.h"
>   #include "qapi/error.h"
>   #include "qemu/module.h"
> +#include "hw/misc/unimp.h"
>   #include "hw/misc/macio/cuda.h"
>   #include "hw/pci/pci.h"
>   #include "hw/ppc/mac_dbdma.h"
> @@ -94,6 +95,7 @@ static bool macio_common_realize(PCIDevice *d, Error **errp)
>   {
>       MacIOState *s = MACIO(d);
>       SysBusDevice *sbd;
> +    DeviceState *dev;
> 
>       if (!qdev_realize(DEVICE(&s->dbdma), BUS(&s->macio_bus), errp)) {
>           return false;
> @@ -102,6 +104,14 @@ static bool macio_common_realize(PCIDevice *d, Error **errp)
>       memory_region_add_subregion(&s->bar, 0x08000,
>                                   sysbus_mmio_get_region(sbd, 0));
> 
> +    dev = qdev_new(TYPE_UNIMPLEMENTED_DEVICE);
> +    qdev_prop_set_string(dev, "name", "screamer");
> +    qdev_prop_set_uint64(dev, "size", 0x1000);
> +    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> +    sbd = SYS_BUS_DEVICE(dev);
> +    memory_region_add_subregion(&s->bar, 0x14000,
> +                                sysbus_mmio_get_region(sbd, 0));
> +
>       qdev_prop_set_uint32(DEVICE(&s->escc), "disabled", 0);
>       qdev_prop_set_uint32(DEVICE(&s->escc), "frequency", ESCC_CLOCK);
>       qdev_prop_set_uint32(DEVICE(&s->escc), "it_shift", 4);
> diff --git a/include/hw/misc/macio/macio.h b/include/hw/misc/macio/macio.h
> index 86df2c2b60..1894178a68 100644
> --- a/include/hw/misc/macio/macio.h
> +++ b/include/hw/misc/macio/macio.h
> @@ -109,6 +109,7 @@ struct MacIOState {
>       PMUState pmu;
>       DBDMAState dbdma;
>       ESCCState escc;
> +    MemoryRegion screamer;
>       uint64_t frequency;
>   };
> 
> 
> 
> ATB,
> 
> Mark.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Mark Cave-Ayland 10 months, 3 weeks ago
On 27/06/2023 13:03, Cédric Le Goater wrote:

>>> Mac OS 9.2 fails to boot with a popup saying :
>>>          Sorry, a system error occured.
>>>          "Sound Manager"
>>>            address error
>>>          To temporarily turn off extensions, restart and
>>>          hold down the shift key
>>>
>>>
>>> Darwin and Mac OSX look OK.
>>
>> My guess would be that MacOS 9.2 is trying to access the sound chip registers which 
>> isn't implemented in QEMU for the moment (I have a separate screamer branch 
>> available, but it's not ready for primetime yet). In theory they shouldn't be 
>> accessed at all because the sound device isn't present in the OpenBIOS device tree, 
>> but this is all fairly old stuff.
>>
>> Does implementing the sound registers using a dummy device help at all?
> 
> Nope. OS 9 loops earlier (little black/white disk spinning).
> 
> Thanks,
> 
> C.

Hmmm that's annoying. Another one to add to the TODO list then...


ATB,

Mark.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Howard Spoelstra 10 months, 3 weeks ago
On Tue, Jun 27, 2023 at 10:15 AM Mark Cave-Ayland <
mark.cave-ayland@ilande.co.uk> wrote:

> On 26/06/2023 14:35, Cédric Le Goater wrote:
>
> > On 6/23/23 14:37, Cédric Le Goater wrote:
> >> On 6/23/23 11:10, Peter Maydell wrote:
> >>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com>
> wrote:
> >>>>
> >>>> ppc has always silently ignored access to real (physical) addresses
> >>>> with nothing behind it, which can make debugging difficult at times.
> >>>>
> >>>> It looks like the way to handle this is implement the transaction
> >>>> failed call, which most target architectures do. Notably not x86
> >>>> though, I wonder why?
> >>>
> >>> Much of this is historical legacy. QEMU originally had no
> >>> concept of "the system outside the CPU returns some kind
> >>> of bus error and the CPU raises an exception for it".
> >>> This is turn is (I think) because the x86 PC doesn't do
> >>> that: you always get back some kind of response, I think
> >>> -1 on reads and writes ignored. We added the do_transaction_failed
> >>> hook largely because we wanted it to give more accurate
> >>> emulation of this kind of thing on Arm, but as usual with new
> >>> facilities we left the other architectures to do it themselves
> >>> if they wanted -- by default the behaviour remained the same.
> >>> Some architectures have picked it up; some haven't.
> >>>
> >>> The main reason it's a bit of a pain to turn the correct
> >>> handling on is because often boards don't actually implement
> >>> all the devices they're supposed to. For a pile of legacy Arm
> >>> boards, especially where we didn't have good test images,
> >>> we use the machine flag ignore_memory_transaction_failures to
> >>> retain the legacy behaviour. (This isn't great because it's
> >>> pretty much going to mean we have that flag set on those
> >>> boards forever because nobody is going to care enough to
> >>> investigate and test.)
> >>>
> >>>> Other question is, sometimes I guess it's nice to avoid crashing in
> >>>> order to try to quickly get past some unimplemented MMIO. Maybe a
> >>>> command line option or something could turn it off? It should
> >>>> probably be a QEMU-wide option if so, so that shouldn't hold this
> >>>> series up, I can propose a option for that if anybody is worried
> >>>> about it.
> >>>
> >>> I would not recommend going any further than maybe setting the
> >>> ignore_memory_transaction_failures flag for boards you don't
> >>> care about. (But in an ideal world, don't set it and deal with
> >>> any bug reports by implementing stub versions of missing devices.
> >>> Depends how confident you are in your test coverage.)
> >>
> >> It seems it broke the "mac99" and  powernv10 machines, using the
> >> qemu-ppc-boot images which are mostly buildroot. See below for logs.
> >>
> >> Adding Mark for further testing on Mac OS.
> >
> >
> > Mac OS 9.2 fails to boot with a popup saying :
> >          Sorry, a system error occured.
> >          "Sound Manager"
> >            address error
> >          To temporarily turn off extensions, restart and
> >          hold down the shift key
> >
> >
> > Darwin and Mac OSX look OK.
>
> My guess would be that MacOS 9.2 is trying to access the sound chip
> registers which
> isn't implemented in QEMU for the moment (I have a separate screamer
> branch
> available, but it's not ready for primetime yet). In theory they shouldn't
> be
> accessed at all because the sound device isn't present in the OpenBIOS
> device tree,
> but this is all fairly old stuff.
>
> Does implementing the sound registers using a dummy device help at all?
>
>
My uneducated guess is that you stumbled on a longstanding, but
intermittently occurring, issue specific to Mac OS 9.2 related to sound
support over USB in Apple monitors.
I believe It is not fixed by the patch set from the 23 of june, I still get
system errors when running Mac OS 9.2 with the mac99 machine after applying
them.
Mac OS 9.2 has required mac99,via=pmu for a long time now to always boot
successfully. (while 9.0.4 requires mac99 to boot, due to an undiagnosed
OHCI USB problem with the specific drivers that ship with it.)  ;-)

Best,
Howard


>
> diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c
> index 265c0bbd8d..e55f938da7 100644
> --- a/hw/misc/macio/macio.c
> +++ b/hw/misc/macio/macio.c
> @@ -26,6 +26,7 @@
>   #include "qemu/osdep.h"
>   #include "qapi/error.h"
>   #include "qemu/module.h"
> +#include "hw/misc/unimp.h"
>   #include "hw/misc/macio/cuda.h"
>   #include "hw/pci/pci.h"
>   #include "hw/ppc/mac_dbdma.h"
> @@ -94,6 +95,7 @@ static bool macio_common_realize(PCIDevice *d, Error
> **errp)
>   {
>       MacIOState *s = MACIO(d);
>       SysBusDevice *sbd;
> +    DeviceState *dev;
>
>       if (!qdev_realize(DEVICE(&s->dbdma), BUS(&s->macio_bus), errp)) {
>           return false;
> @@ -102,6 +104,14 @@ static bool macio_common_realize(PCIDevice *d, Error
> **errp)
>       memory_region_add_subregion(&s->bar, 0x08000,
>                                   sysbus_mmio_get_region(sbd, 0));
>
> +    dev = qdev_new(TYPE_UNIMPLEMENTED_DEVICE);
> +    qdev_prop_set_string(dev, "name", "screamer");
> +    qdev_prop_set_uint64(dev, "size", 0x1000);
> +    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> +    sbd = SYS_BUS_DEVICE(dev);
> +    memory_region_add_subregion(&s->bar, 0x14000,
> +                                sysbus_mmio_get_region(sbd, 0));
> +
>       qdev_prop_set_uint32(DEVICE(&s->escc), "disabled", 0);
>       qdev_prop_set_uint32(DEVICE(&s->escc), "frequency", ESCC_CLOCK);
>       qdev_prop_set_uint32(DEVICE(&s->escc), "it_shift", 4);
> diff --git a/include/hw/misc/macio/macio.h b/include/hw/misc/macio/macio.h
> index 86df2c2b60..1894178a68 100644
> --- a/include/hw/misc/macio/macio.h
> +++ b/include/hw/misc/macio/macio.h
> @@ -109,6 +109,7 @@ struct MacIOState {
>       PMUState pmu;
>       DBDMAState dbdma;
>       ESCCState escc;
> +    MemoryRegion screamer;
>       uint64_t frequency;
>   };
>
>
>
> ATB,
>
> Mark.
>
>
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Mark Cave-Ayland 10 months, 3 weeks ago
On 27/06/2023 11:28, Howard Spoelstra wrote:

> On Tue, Jun 27, 2023 at 10:15 AM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk 
> <mailto:mark.cave-ayland@ilande.co.uk>> wrote:
> 
>     On 26/06/2023 14:35, Cédric Le Goater wrote:
> 
>      > On 6/23/23 14:37, Cédric Le Goater wrote:
>      >> On 6/23/23 11:10, Peter Maydell wrote:
>      >>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com
>     <mailto:npiggin@gmail.com>> wrote:
>      >>>>
>      >>>> ppc has always silently ignored access to real (physical) addresses
>      >>>> with nothing behind it, which can make debugging difficult at times.
>      >>>>
>      >>>> It looks like the way to handle this is implement the transaction
>      >>>> failed call, which most target architectures do. Notably not x86
>      >>>> though, I wonder why?
>      >>>
>      >>> Much of this is historical legacy. QEMU originally had no
>      >>> concept of "the system outside the CPU returns some kind
>      >>> of bus error and the CPU raises an exception for it".
>      >>> This is turn is (I think) because the x86 PC doesn't do
>      >>> that: you always get back some kind of response, I think
>      >>> -1 on reads and writes ignored. We added the do_transaction_failed
>      >>> hook largely because we wanted it to give more accurate
>      >>> emulation of this kind of thing on Arm, but as usual with new
>      >>> facilities we left the other architectures to do it themselves
>      >>> if they wanted -- by default the behaviour remained the same.
>      >>> Some architectures have picked it up; some haven't.
>      >>>
>      >>> The main reason it's a bit of a pain to turn the correct
>      >>> handling on is because often boards don't actually implement
>      >>> all the devices they're supposed to. For a pile of legacy Arm
>      >>> boards, especially where we didn't have good test images,
>      >>> we use the machine flag ignore_memory_transaction_failures to
>      >>> retain the legacy behaviour. (This isn't great because it's
>      >>> pretty much going to mean we have that flag set on those
>      >>> boards forever because nobody is going to care enough to
>      >>> investigate and test.)
>      >>>
>      >>>> Other question is, sometimes I guess it's nice to avoid crashing in
>      >>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>      >>>> command line option or something could turn it off? It should
>      >>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>      >>>> series up, I can propose a option for that if anybody is worried
>      >>>> about it.
>      >>>
>      >>> I would not recommend going any further than maybe setting the
>      >>> ignore_memory_transaction_failures flag for boards you don't
>      >>> care about. (But in an ideal world, don't set it and deal with
>      >>> any bug reports by implementing stub versions of missing devices.
>      >>> Depends how confident you are in your test coverage.)
>      >>
>      >> It seems it broke the "mac99" and  powernv10 machines, using the
>      >> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>      >>
>      >> Adding Mark for further testing on Mac OS.
>      >
>      >
>      > Mac OS 9.2 fails to boot with a popup saying :
>      >          Sorry, a system error occured.
>      >          "Sound Manager"
>      >            address error
>      >          To temporarily turn off extensions, restart and
>      >          hold down the shift key
>      >
>      >
>      > Darwin and Mac OSX look OK.
> 
>     My guess would be that MacOS 9.2 is trying to access the sound chip registers which
>     isn't implemented in QEMU for the moment (I have a separate screamer branch
>     available, but it's not ready for primetime yet). In theory they shouldn't be
>     accessed at all because the sound device isn't present in the OpenBIOS device tree,
>     but this is all fairly old stuff.
> 
>     Does implementing the sound registers using a dummy device help at all?
> 
> 
> My uneducated guess is that you stumbled on a longstanding, but intermittently 
> occurring, issue specific to Mac OS 9.2 related to sound support over USB in Apple 
> monitors.

I'm not sure I understand this: are there non-standard command line options being 
used here other than "qemu-system-ppc -M mac99 -cdrom macos92.iso -boot d"?

> I believe It is not fixed by the patch set from the 23 of june, I still get system 
> errors when running Mac OS 9.2 with the mac99 machine after applying them.
> Mac OS 9.2 has required mac99,via=pmu for a long time now to always boot 
> successfully. (while 9.0.4 requires mac99 to boot, due to an undiagnosed OHCI USB 
> problem with the specific drivers that ship with it.)  ;-)

I always test MacOS 9.2 boot both with and without via=pmu for my OpenBIOS tests, so 
I'd expect this to work unless a regression has slipped in?


ATB,

Mark.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Howard Spoelstra 10 months, 3 weeks ago
On Tue, Jun 27, 2023 at 1:24 PM Mark Cave-Ayland <
mark.cave-ayland@ilande.co.uk> wrote:

> On 27/06/2023 11:28, Howard Spoelstra wrote:
>
> > On Tue, Jun 27, 2023 at 10:15 AM Mark Cave-Ayland <
> mark.cave-ayland@ilande.co.uk
> > <mailto:mark.cave-ayland@ilande.co.uk>> wrote:
> >
> >     On 26/06/2023 14:35, Cédric Le Goater wrote:
> >
> >      > On 6/23/23 14:37, Cédric Le Goater wrote:
> >      >> On 6/23/23 11:10, Peter Maydell wrote:
> >      >>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <
> npiggin@gmail.com
> >     <mailto:npiggin@gmail.com>> wrote:
> >      >>>>
> >      >>>> ppc has always silently ignored access to real (physical)
> addresses
> >      >>>> with nothing behind it, which can make debugging difficult at
> times.
> >      >>>>
> >      >>>> It looks like the way to handle this is implement the
> transaction
> >      >>>> failed call, which most target architectures do. Notably not
> x86
> >      >>>> though, I wonder why?
> >      >>>
> >      >>> Much of this is historical legacy. QEMU originally had no
> >      >>> concept of "the system outside the CPU returns some kind
> >      >>> of bus error and the CPU raises an exception for it".
> >      >>> This is turn is (I think) because the x86 PC doesn't do
> >      >>> that: you always get back some kind of response, I think
> >      >>> -1 on reads and writes ignored. We added the
> do_transaction_failed
> >      >>> hook largely because we wanted it to give more accurate
> >      >>> emulation of this kind of thing on Arm, but as usual with new
> >      >>> facilities we left the other architectures to do it themselves
> >      >>> if they wanted -- by default the behaviour remained the same.
> >      >>> Some architectures have picked it up; some haven't.
> >      >>>
> >      >>> The main reason it's a bit of a pain to turn the correct
> >      >>> handling on is because often boards don't actually implement
> >      >>> all the devices they're supposed to. For a pile of legacy Arm
> >      >>> boards, especially where we didn't have good test images,
> >      >>> we use the machine flag ignore_memory_transaction_failures to
> >      >>> retain the legacy behaviour. (This isn't great because it's
> >      >>> pretty much going to mean we have that flag set on those
> >      >>> boards forever because nobody is going to care enough to
> >      >>> investigate and test.)
> >      >>>
> >      >>>> Other question is, sometimes I guess it's nice to avoid
> crashing in
> >      >>>> order to try to quickly get past some unimplemented MMIO.
> Maybe a
> >      >>>> command line option or something could turn it off? It should
> >      >>>> probably be a QEMU-wide option if so, so that shouldn't hold
> this
> >      >>>> series up, I can propose a option for that if anybody is
> worried
> >      >>>> about it.
> >      >>>
> >      >>> I would not recommend going any further than maybe setting the
> >      >>> ignore_memory_transaction_failures flag for boards you don't
> >      >>> care about. (But in an ideal world, don't set it and deal with
> >      >>> any bug reports by implementing stub versions of missing
> devices.
> >      >>> Depends how confident you are in your test coverage.)
> >      >>
> >      >> It seems it broke the "mac99" and  powernv10 machines, using the
> >      >> qemu-ppc-boot images which are mostly buildroot. See below for
> logs.
> >      >>
> >      >> Adding Mark for further testing on Mac OS.
> >      >
> >      >
> >      > Mac OS 9.2 fails to boot with a popup saying :
> >      >          Sorry, a system error occured.
> >      >          "Sound Manager"
> >      >            address error
> >      >          To temporarily turn off extensions, restart and
> >      >          hold down the shift key
> >      >
> >      >
> >      > Darwin and Mac OSX look OK.
> >
> >     My guess would be that MacOS 9.2 is trying to access the sound chip
> registers which
> >     isn't implemented in QEMU for the moment (I have a separate screamer
> branch
> >     available, but it's not ready for primetime yet). In theory they
> shouldn't be
> >     accessed at all because the sound device isn't present in the
> OpenBIOS device tree,
> >     but this is all fairly old stuff.
> >
> >     Does implementing the sound registers using a dummy device help at
> all?
> >
> >
> > My uneducated guess is that you stumbled on a longstanding, but
> intermittently
> > occurring, issue specific to Mac OS 9.2 related to sound support over
> USB in Apple
> > monitors.
>
> I'm not sure I understand this: are there non-standard command line
> options being
> used here other than "qemu-system-ppc -M mac99 -cdrom macos92.iso -boot d"?
>


It must be my windows host ;-)

qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L
pc-bios
crashes Mac OS with an address error. (with unpatched and patched builds).

qemu-system-ppc.exe -M mac99 -hda C:\mac-hd\9.2.2-clean.img -boot c -L
pc-bios sometimes crashes with an illegal instruction.

qemu-system-ppc.exe -M mac99,via=pmu -hda C:\mac-hd\9.2.2-clean.img -boot c
-L pc-bios sometimes crashes with Sound manager address error.
(with both patched and non-patched versions).

Best,
Howard



>
> > I believe It is not fixed by the patch set from the 23 of june, I still
> get system
> > errors when running Mac OS 9.2 with the mac99 machine after applying
> them.
> > Mac OS 9.2 has required mac99,via=pmu for a long time now to always boot
> > successfully. (while 9.0.4 requires mac99 to boot, due to an undiagnosed
> OHCI USB
> > problem with the specific drivers that ship with it.)  ;-)
>
> I always test MacOS 9.2 boot both with and without via=pmu for my OpenBIOS
> tests, so
> I'd expect this to work unless a regression has slipped in?
>
>
> ATB,
>
> Mark.
>
>
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
On 6/27/23 14:05, Howard Spoelstra wrote:
> 
> 
> On Tue, Jun 27, 2023 at 1:24 PM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>> wrote:
> 
>     On 27/06/2023 11:28, Howard Spoelstra wrote:
> 
>      > On Tue, Jun 27, 2023 at 10:15 AM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>
>      > <mailto:mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>>> wrote:
>      >
>      >     On 26/06/2023 14:35, Cédric Le Goater wrote:
>      >
>      >      > On 6/23/23 14:37, Cédric Le Goater wrote:
>      >      >> On 6/23/23 11:10, Peter Maydell wrote:
>      >      >>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com <mailto:npiggin@gmail.com>
>      >     <mailto:npiggin@gmail.com <mailto:npiggin@gmail.com>>> wrote:
>      >      >>>>
>      >      >>>> ppc has always silently ignored access to real (physical) addresses
>      >      >>>> with nothing behind it, which can make debugging difficult at times.
>      >      >>>>
>      >      >>>> It looks like the way to handle this is implement the transaction
>      >      >>>> failed call, which most target architectures do. Notably not x86
>      >      >>>> though, I wonder why?
>      >      >>>
>      >      >>> Much of this is historical legacy. QEMU originally had no
>      >      >>> concept of "the system outside the CPU returns some kind
>      >      >>> of bus error and the CPU raises an exception for it".
>      >      >>> This is turn is (I think) because the x86 PC doesn't do
>      >      >>> that: you always get back some kind of response, I think
>      >      >>> -1 on reads and writes ignored. We added the do_transaction_failed
>      >      >>> hook largely because we wanted it to give more accurate
>      >      >>> emulation of this kind of thing on Arm, but as usual with new
>      >      >>> facilities we left the other architectures to do it themselves
>      >      >>> if they wanted -- by default the behaviour remained the same.
>      >      >>> Some architectures have picked it up; some haven't.
>      >      >>>
>      >      >>> The main reason it's a bit of a pain to turn the correct
>      >      >>> handling on is because often boards don't actually implement
>      >      >>> all the devices they're supposed to. For a pile of legacy Arm
>      >      >>> boards, especially where we didn't have good test images,
>      >      >>> we use the machine flag ignore_memory_transaction_failures to
>      >      >>> retain the legacy behaviour. (This isn't great because it's
>      >      >>> pretty much going to mean we have that flag set on those
>      >      >>> boards forever because nobody is going to care enough to
>      >      >>> investigate and test.)
>      >      >>>
>      >      >>>> Other question is, sometimes I guess it's nice to avoid crashing in
>      >      >>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>      >      >>>> command line option or something could turn it off? It should
>      >      >>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>      >      >>>> series up, I can propose a option for that if anybody is worried
>      >      >>>> about it.
>      >      >>>
>      >      >>> I would not recommend going any further than maybe setting the
>      >      >>> ignore_memory_transaction_failures flag for boards you don't
>      >      >>> care about. (But in an ideal world, don't set it and deal with
>      >      >>> any bug reports by implementing stub versions of missing devices.
>      >      >>> Depends how confident you are in your test coverage.)
>      >      >>
>      >      >> It seems it broke the "mac99" and  powernv10 machines, using the
>      >      >> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>      >      >>
>      >      >> Adding Mark for further testing on Mac OS.
>      >      >
>      >      >
>      >      > Mac OS 9.2 fails to boot with a popup saying :
>      >      >          Sorry, a system error occured.
>      >      >          "Sound Manager"
>      >      >            address error
>      >      >          To temporarily turn off extensions, restart and
>      >      >          hold down the shift key
>      >      >
>      >      >
>      >      > Darwin and Mac OSX look OK.
>      >
>      >     My guess would be that MacOS 9.2 is trying to access the sound chip registers which
>      >     isn't implemented in QEMU for the moment (I have a separate screamer branch
>      >     available, but it's not ready for primetime yet). In theory they shouldn't be
>      >     accessed at all because the sound device isn't present in the OpenBIOS device tree,
>      >     but this is all fairly old stuff.
>      >
>      >     Does implementing the sound registers using a dummy device help at all?
>      >
>      >
>      > My uneducated guess is that you stumbled on a longstanding, but intermittently
>      > occurring, issue specific to Mac OS 9.2 related to sound support over USB in Apple
>      > monitors.
> 
>     I'm not sure I understand this: are there non-standard command line options being
>     used here other than "qemu-system-ppc -M mac99 -cdrom macos92.iso -boot d"?
> 
> 
> 
> It must be my windows host ;-)
> 
> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
> crashes Mac OS with an address error. (with unpatched and patched builds).

Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.

C.



> 
> qemu-system-ppc.exe -M mac99 -hda C:\mac-hd\9.2.2-clean.img -boot c -L pc-bios sometimes crashes with an illegal instruction.
> 
> qemu-system-ppc.exe -M mac99,via=pmu -hda C:\mac-hd\9.2.2-clean.img -boot c -L pc-bios sometimes crashes with Sound manager address error.
> (with both patched and non-patched versions).
> 
> Best,
> Howard
> 
> 
>      > I believe It is not fixed by the patch set from the 23 of june, I still get system
>      > errors when running Mac OS 9.2 with the mac99 machine after applying them.
>      > Mac OS 9.2 has required mac99,via=pmu for a long time now to always boot
>      > successfully. (while 9.0.4 requires mac99 to boot, due to an undiagnosed OHCI USB
>      > problem with the specific drivers that ship with it.)  ;-)
> 
>     I always test MacOS 9.2 boot both with and without via=pmu for my OpenBIOS tests, so
>     I'd expect this to work unless a regression has slipped in?
> 
> 
>     ATB,
> 
>     Mark.
> 


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Mark Cave-Ayland 10 months, 3 weeks ago
On 27/06/2023 13:41, Cédric Le Goater wrote:

> On 6/27/23 14:05, Howard Spoelstra wrote:
>>
>>
>> On Tue, Jun 27, 2023 at 1:24 PM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk 
>> <mailto:mark.cave-ayland@ilande.co.uk>> wrote:
>>
>>     On 27/06/2023 11:28, Howard Spoelstra wrote:
>>
>>      > On Tue, Jun 27, 2023 at 10:15 AM Mark Cave-Ayland 
>> <mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>
>>      > <mailto:mark.cave-ayland@ilande.co.uk 
>> <mailto:mark.cave-ayland@ilande.co.uk>>> wrote:
>>      >
>>      >     On 26/06/2023 14:35, Cédric Le Goater wrote:
>>      >
>>      >      > On 6/23/23 14:37, Cédric Le Goater wrote:
>>      >      >> On 6/23/23 11:10, Peter Maydell wrote:
>>      >      >>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com 
>> <mailto:npiggin@gmail.com>
>>      >     <mailto:npiggin@gmail.com <mailto:npiggin@gmail.com>>> wrote:
>>      >      >>>>
>>      >      >>>> ppc has always silently ignored access to real (physical) addresses
>>      >      >>>> with nothing behind it, which can make debugging difficult at times.
>>      >      >>>>
>>      >      >>>> It looks like the way to handle this is implement the transaction
>>      >      >>>> failed call, which most target architectures do. Notably not x86
>>      >      >>>> though, I wonder why?
>>      >      >>>
>>      >      >>> Much of this is historical legacy. QEMU originally had no
>>      >      >>> concept of "the system outside the CPU returns some kind
>>      >      >>> of bus error and the CPU raises an exception for it".
>>      >      >>> This is turn is (I think) because the x86 PC doesn't do
>>      >      >>> that: you always get back some kind of response, I think
>>      >      >>> -1 on reads and writes ignored. We added the do_transaction_failed
>>      >      >>> hook largely because we wanted it to give more accurate
>>      >      >>> emulation of this kind of thing on Arm, but as usual with new
>>      >      >>> facilities we left the other architectures to do it themselves
>>      >      >>> if they wanted -- by default the behaviour remained the same.
>>      >      >>> Some architectures have picked it up; some haven't.
>>      >      >>>
>>      >      >>> The main reason it's a bit of a pain to turn the correct
>>      >      >>> handling on is because often boards don't actually implement
>>      >      >>> all the devices they're supposed to. For a pile of legacy Arm
>>      >      >>> boards, especially where we didn't have good test images,
>>      >      >>> we use the machine flag ignore_memory_transaction_failures to
>>      >      >>> retain the legacy behaviour. (This isn't great because it's
>>      >      >>> pretty much going to mean we have that flag set on those
>>      >      >>> boards forever because nobody is going to care enough to
>>      >      >>> investigate and test.)
>>      >      >>>
>>      >      >>>> Other question is, sometimes I guess it's nice to avoid crashing in
>>      >      >>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>>      >      >>>> command line option or something could turn it off? It should
>>      >      >>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>>      >      >>>> series up, I can propose a option for that if anybody is worried
>>      >      >>>> about it.
>>      >      >>>
>>      >      >>> I would not recommend going any further than maybe setting the
>>      >      >>> ignore_memory_transaction_failures flag for boards you don't
>>      >      >>> care about. (But in an ideal world, don't set it and deal with
>>      >      >>> any bug reports by implementing stub versions of missing devices.
>>      >      >>> Depends how confident you are in your test coverage.)
>>      >      >>
>>      >      >> It seems it broke the "mac99" and  powernv10 machines, using the
>>      >      >> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>>      >      >>
>>      >      >> Adding Mark for further testing on Mac OS.
>>      >      >
>>      >      >
>>      >      > Mac OS 9.2 fails to boot with a popup saying :
>>      >      >          Sorry, a system error occured.
>>      >      >          "Sound Manager"
>>      >      >            address error
>>      >      >          To temporarily turn off extensions, restart and
>>      >      >          hold down the shift key
>>      >      >
>>      >      >
>>      >      > Darwin and Mac OSX look OK.
>>      >
>>      >     My guess would be that MacOS 9.2 is trying to access the sound chip 
>> registers which
>>      >     isn't implemented in QEMU for the moment (I have a separate screamer branch
>>      >     available, but it's not ready for primetime yet). In theory they 
>> shouldn't be
>>      >     accessed at all because the sound device isn't present in the OpenBIOS 
>> device tree,
>>      >     but this is all fairly old stuff.
>>      >
>>      >     Does implementing the sound registers using a dummy device help at all?
>>      >
>>      >
>>      > My uneducated guess is that you stumbled on a longstanding, but intermittently
>>      > occurring, issue specific to Mac OS 9.2 related to sound support over USB in 
>> Apple
>>      > monitors.
>>
>>     I'm not sure I understand this: are there non-standard command line options being
>>     used here other than "qemu-system-ppc -M mac99 -cdrom macos92.iso -boot d"?
>>
>>
>>
>> It must be my windows host ;-)
>>
>> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
>> crashes Mac OS with an address error. (with unpatched and patched builds).
> 
> Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.
> 
> C.

That certainly shouldn't happen, and if it worked in 7.2 then there's definitely a 
regression which has crept in there somewhere. I'll try and bisect this at some point 
soon, but feel free to try and beat me ;)


ATB,

Mark.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
On 6/27/23 22:26, Mark Cave-Ayland wrote:
> On 27/06/2023 13:41, Cédric Le Goater wrote:
> 
>> On 6/27/23 14:05, Howard Spoelstra wrote:
>>>
>>>
>>> On Tue, Jun 27, 2023 at 1:24 PM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>> wrote:
>>>
>>>     On 27/06/2023 11:28, Howard Spoelstra wrote:
>>>
>>>      > On Tue, Jun 27, 2023 at 10:15 AM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>
>>>      > <mailto:mark.cave-ayland@ilande.co.uk <mailto:mark.cave-ayland@ilande.co.uk>>> wrote:
>>>      >
>>>      >     On 26/06/2023 14:35, Cédric Le Goater wrote:
>>>      >
>>>      >      > On 6/23/23 14:37, Cédric Le Goater wrote:
>>>      >      >> On 6/23/23 11:10, Peter Maydell wrote:
>>>      >      >>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com <mailto:npiggin@gmail.com>
>>>      >     <mailto:npiggin@gmail.com <mailto:npiggin@gmail.com>>> wrote:
>>>      >      >>>>
>>>      >      >>>> ppc has always silently ignored access to real (physical) addresses
>>>      >      >>>> with nothing behind it, which can make debugging difficult at times.
>>>      >      >>>>
>>>      >      >>>> It looks like the way to handle this is implement the transaction
>>>      >      >>>> failed call, which most target architectures do. Notably not x86
>>>      >      >>>> though, I wonder why?
>>>      >      >>>
>>>      >      >>> Much of this is historical legacy. QEMU originally had no
>>>      >      >>> concept of "the system outside the CPU returns some kind
>>>      >      >>> of bus error and the CPU raises an exception for it".
>>>      >      >>> This is turn is (I think) because the x86 PC doesn't do
>>>      >      >>> that: you always get back some kind of response, I think
>>>      >      >>> -1 on reads and writes ignored. We added the do_transaction_failed
>>>      >      >>> hook largely because we wanted it to give more accurate
>>>      >      >>> emulation of this kind of thing on Arm, but as usual with new
>>>      >      >>> facilities we left the other architectures to do it themselves
>>>      >      >>> if they wanted -- by default the behaviour remained the same.
>>>      >      >>> Some architectures have picked it up; some haven't.
>>>      >      >>>
>>>      >      >>> The main reason it's a bit of a pain to turn the correct
>>>      >      >>> handling on is because often boards don't actually implement
>>>      >      >>> all the devices they're supposed to. For a pile of legacy Arm
>>>      >      >>> boards, especially where we didn't have good test images,
>>>      >      >>> we use the machine flag ignore_memory_transaction_failures to
>>>      >      >>> retain the legacy behaviour. (This isn't great because it's
>>>      >      >>> pretty much going to mean we have that flag set on those
>>>      >      >>> boards forever because nobody is going to care enough to
>>>      >      >>> investigate and test.)
>>>      >      >>>
>>>      >      >>>> Other question is, sometimes I guess it's nice to avoid crashing in
>>>      >      >>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>>>      >      >>>> command line option or something could turn it off? It should
>>>      >      >>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>>>      >      >>>> series up, I can propose a option for that if anybody is worried
>>>      >      >>>> about it.
>>>      >      >>>
>>>      >      >>> I would not recommend going any further than maybe setting the
>>>      >      >>> ignore_memory_transaction_failures flag for boards you don't
>>>      >      >>> care about. (But in an ideal world, don't set it and deal with
>>>      >      >>> any bug reports by implementing stub versions of missing devices.
>>>      >      >>> Depends how confident you are in your test coverage.)
>>>      >      >>
>>>      >      >> It seems it broke the "mac99" and  powernv10 machines, using the
>>>      >      >> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>>>      >      >>
>>>      >      >> Adding Mark for further testing on Mac OS.
>>>      >      >
>>>      >      >
>>>      >      > Mac OS 9.2 fails to boot with a popup saying :
>>>      >      >          Sorry, a system error occured.
>>>      >      >          "Sound Manager"
>>>      >      >            address error
>>>      >      >          To temporarily turn off extensions, restart and
>>>      >      >          hold down the shift key
>>>      >      >
>>>      >      >
>>>      >      > Darwin and Mac OSX look OK.
>>>      >
>>>      >     My guess would be that MacOS 9.2 is trying to access the sound chip registers which
>>>      >     isn't implemented in QEMU for the moment (I have a separate screamer branch
>>>      >     available, but it's not ready for primetime yet). In theory they shouldn't be
>>>      >     accessed at all because the sound device isn't present in the OpenBIOS device tree,
>>>      >     but this is all fairly old stuff.
>>>      >
>>>      >     Does implementing the sound registers using a dummy device help at all?
>>>      >
>>>      >
>>>      > My uneducated guess is that you stumbled on a longstanding, but intermittently
>>>      > occurring, issue specific to Mac OS 9.2 related to sound support over USB in Apple
>>>      > monitors.
>>>
>>>     I'm not sure I understand this: are there non-standard command line options being
>>>     used here other than "qemu-system-ppc -M mac99 -cdrom macos92.iso -boot d"?
>>>
>>>
>>>
>>> It must be my windows host ;-)
>>>
>>> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
>>> crashes Mac OS with an address error. (with unpatched and patched builds).
>>
>> Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.
>>
>> C.
> 
> That certainly shouldn't happen, and if it worked in 7.2 then there's definitely a regression which has crept in there somewhere. I'll try and bisect this at some point soon, but feel free to try and beat me ;)

bisect points to :

commit e506ad6a05c806bbef460a7d014a184ff8d707a6
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Mon Mar 6 04:30:11 2023 +0300

     accel/tcg: Pass last not end to tb_invalidate_phys_range
     
     Pass the address of the last byte to be changed, rather than
     the first address past the last byte.  This avoids overflow
     when the last page of the address space is involved.
     
     Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
     Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

  include/exec/exec-all.h   |  2 +-
  accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
  accel/tcg/translate-all.c |  2 +-
  accel/tcg/user-exec.c     |  2 +-
  softmmu/physmem.c         |  2 +-
  5 files changed, 20 insertions(+), 19 deletions(-)


I think the instruction is fnmadds. Needs more digging.

Thanks,

C.

Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
>>>> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
>>>> crashes Mac OS with an address error. (with unpatched and patched builds).
>>>
>>> Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.
>>>
>>> C.
>>
>> That certainly shouldn't happen, and if it worked in 7.2 then there's definitely a regression which has crept in there somewhere. I'll try and bisect this at some point soon, but feel free to try and beat me ;)
> 
> bisect points to :
> 
> commit e506ad6a05c806bbef460a7d014a184ff8d707a6
> Author: Richard Henderson <richard.henderson@linaro.org>
> Date:   Mon Mar 6 04:30:11 2023 +0300
> 
>      accel/tcg: Pass last not end to tb_invalidate_phys_range
>      Pass the address of the last byte to be changed, rather than
>      the first address past the last byte.  This avoids overflow
>      when the last page of the address space is involved.
>      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> 
>   include/exec/exec-all.h   |  2 +-
>   accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
>   accel/tcg/translate-all.c |  2 +-
>   accel/tcg/user-exec.c     |  2 +-
>   softmmu/physmem.c         |  2 +-
>   5 files changed, 20 insertions(+), 19 deletions(-)
> 
> 
> I think the instruction is fnmadds. Needs more digging.

the invalid opcode is just a symptom of something bad happening.

C.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Mark Cave-Ayland 10 months, 3 weeks ago
On 28/06/2023 08:17, Cédric Le Goater wrote:

>>>>> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
>>>>> crashes Mac OS with an address error. (with unpatched and patched builds).
>>>>
>>>> Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.
>>>>
>>>> C.
>>>
>>> That certainly shouldn't happen, and if it worked in 7.2 then there's definitely a 
>>> regression which has crept in there somewhere. I'll try and bisect this at some 
>>> point soon, but feel free to try and beat me ;)
>>
>> bisect points to :
>>
>> commit e506ad6a05c806bbef460a7d014a184ff8d707a6
>> Author: Richard Henderson <richard.henderson@linaro.org>
>> Date:   Mon Mar 6 04:30:11 2023 +0300
>>
>>      accel/tcg: Pass last not end to tb_invalidate_phys_range
>>      Pass the address of the last byte to be changed, rather than
>>      the first address past the last byte.  This avoids overflow
>>      when the last page of the address space is involved.
>>      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>>      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>
>>   include/exec/exec-all.h   |  2 +-
>>   accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
>>   accel/tcg/translate-all.c |  2 +-
>>   accel/tcg/user-exec.c     |  2 +-
>>   softmmu/physmem.c         |  2 +-
>>   5 files changed, 20 insertions(+), 19 deletions(-)
>>
>>
>> I think the instruction is fnmadds. Needs more digging.
> 
> the invalid opcode is just a symptom of something bad happening.
> 
> C.

Indeed, it appears to be a copy/paste error within that commit. I've just posted a 
proposed fix for this: 
https://lore.kernel.org/qemu-devel/20230629082522.606219-1-mark.cave-ayland@ilande.co.uk/T/.

Nick: you may wish to try your series again with this fix applied to see if there are 
still problems with the CPUs used in the Mac machines.


ATB,

Mark.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
On 6/29/23 10:29, Mark Cave-Ayland wrote:
> On 28/06/2023 08:17, Cédric Le Goater wrote:
> 
>>>>>> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
>>>>>> crashes Mac OS with an address error. (with unpatched and patched builds).
>>>>>
>>>>> Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.
>>>>>
>>>>> C.
>>>>
>>>> That certainly shouldn't happen, and if it worked in 7.2 then there's definitely a regression which has crept in there somewhere. I'll try and bisect this at some point soon, but feel free to try and beat me ;)
>>>
>>> bisect points to :
>>>
>>> commit e506ad6a05c806bbef460a7d014a184ff8d707a6
>>> Author: Richard Henderson <richard.henderson@linaro.org>
>>> Date:   Mon Mar 6 04:30:11 2023 +0300
>>>
>>>      accel/tcg: Pass last not end to tb_invalidate_phys_range
>>>      Pass the address of the last byte to be changed, rather than
>>>      the first address past the last byte.  This avoids overflow
>>>      when the last page of the address space is involved.
>>>      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>>>      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>>
>>>   include/exec/exec-all.h   |  2 +-
>>>   accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
>>>   accel/tcg/translate-all.c |  2 +-
>>>   accel/tcg/user-exec.c     |  2 +-
>>>   softmmu/physmem.c         |  2 +-
>>>   5 files changed, 20 insertions(+), 19 deletions(-)
>>>
>>>
>>> I think the instruction is fnmadds. Needs more digging.
>>
>> the invalid opcode is just a symptom of something bad happening.
>>
>> C.
> 
> Indeed, it appears to be a copy/paste error within that commit. I've just posted a proposed fix for this: https://lore.kernel.org/qemu-devel/20230629082522.606219-1-mark.cave-ayland@ilande.co.uk/T/.

Looks good. I could boot macos 9.2.1 from an iso and 9.2.2 from disk.

> Nick: you may wish to try your series again with this fix applied to see if there are still problems with the CPUs used in the Mac machines.

Looks good too, with these :

  [PATCH v2 1/4] target/ppc: Machine check on invalid real address
  [PATCH v2 2/4] target/ppc: Move common check in machine check
  [PATCH v2 3/4] target/ppc: Make checkstop actually stop the system


Patch 4 (attn) doesn't compile but it's an extra behavior on top of
checkstop.

Thanks,

C.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Nicholas Piggin 10 months, 3 weeks ago
On Thu Jun 29, 2023 at 7:05 PM AEST, Cédric Le Goater wrote:
> On 6/29/23 10:29, Mark Cave-Ayland wrote:
> > On 28/06/2023 08:17, Cédric Le Goater wrote:
> > 
> >>>>>> qemu-system-ppc.exe -M mac99,via=pmu -cdrom C:\mac-iso\9.2.2.iso -boot d -L pc-bios
> >>>>>> crashes Mac OS with an address error. (with unpatched and patched builds).
> >>>>>
> >>>>> Same on Linux. I get an invalid opcode. QEMU 7.2 work fine though.
> >>>>>
> >>>>> C.
> >>>>
> >>>> That certainly shouldn't happen, and if it worked in 7.2 then there's definitely a regression which has crept in there somewhere. I'll try and bisect this at some point soon, but feel free to try and beat me ;)
> >>>
> >>> bisect points to :
> >>>
> >>> commit e506ad6a05c806bbef460a7d014a184ff8d707a6
> >>> Author: Richard Henderson <richard.henderson@linaro.org>
> >>> Date:   Mon Mar 6 04:30:11 2023 +0300
> >>>
> >>>      accel/tcg: Pass last not end to tb_invalidate_phys_range
> >>>      Pass the address of the last byte to be changed, rather than
> >>>      the first address past the last byte.  This avoids overflow
> >>>      when the last page of the address space is involved.
> >>>      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> >>>      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> >>>
> >>>   include/exec/exec-all.h   |  2 +-
> >>>   accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
> >>>   accel/tcg/translate-all.c |  2 +-
> >>>   accel/tcg/user-exec.c     |  2 +-
> >>>   softmmu/physmem.c         |  2 +-
> >>>   5 files changed, 20 insertions(+), 19 deletions(-)
> >>>
> >>>
> >>> I think the instruction is fnmadds. Needs more digging.
> >>
> >> the invalid opcode is just a symptom of something bad happening.
> >>
> >> C.
> > 
> > Indeed, it appears to be a copy/paste error within that commit. I've just posted a proposed fix for this: https://lore.kernel.org/qemu-devel/20230629082522.606219-1-mark.cave-ayland@ilande.co.uk/T/.
>
> Looks good. I could boot macos 9.2.1 from an iso and 9.2.2 from disk.
>
> > Nick: you may wish to try your series again with this fix applied to see if there are still problems with the CPUs used in the Mac machines.
>
> Looks good too, with these :
>
>   [PATCH v2 1/4] target/ppc: Machine check on invalid real address
>   [PATCH v2 2/4] target/ppc: Move common check in machine check
>   [PATCH v2 3/4] target/ppc: Make checkstop actually stop the system

In the v2 series I removed the machine check for 970 btw so there
might still be invalid memory access.

Thanks,
Nick
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Nicholas Piggin 10 months, 3 weeks ago
On Mon Jun 26, 2023 at 11:35 PM AEST, Cédric Le Goater wrote:
> On 6/23/23 14:37, Cédric Le Goater wrote:
> > On 6/23/23 11:10, Peter Maydell wrote:
> >> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
> >>>
> >>> ppc has always silently ignored access to real (physical) addresses
> >>> with nothing behind it, which can make debugging difficult at times.
> >>>
> >>> It looks like the way to handle this is implement the transaction
> >>> failed call, which most target architectures do. Notably not x86
> >>> though, I wonder why?
> >>
> >> Much of this is historical legacy. QEMU originally had no
> >> concept of "the system outside the CPU returns some kind
> >> of bus error and the CPU raises an exception for it".
> >> This is turn is (I think) because the x86 PC doesn't do
> >> that: you always get back some kind of response, I think
> >> -1 on reads and writes ignored. We added the do_transaction_failed
> >> hook largely because we wanted it to give more accurate
> >> emulation of this kind of thing on Arm, but as usual with new
> >> facilities we left the other architectures to do it themselves
> >> if they wanted -- by default the behaviour remained the same.
> >> Some architectures have picked it up; some haven't.
> >>
> >> The main reason it's a bit of a pain to turn the correct
> >> handling on is because often boards don't actually implement
> >> all the devices they're supposed to. For a pile of legacy Arm
> >> boards, especially where we didn't have good test images,
> >> we use the machine flag ignore_memory_transaction_failures to
> >> retain the legacy behaviour. (This isn't great because it's
> >> pretty much going to mean we have that flag set on those
> >> boards forever because nobody is going to care enough to
> >> investigate and test.)
> >>
> >>> Other question is, sometimes I guess it's nice to avoid crashing in
> >>> order to try to quickly get past some unimplemented MMIO. Maybe a
> >>> command line option or something could turn it off? It should
> >>> probably be a QEMU-wide option if so, so that shouldn't hold this
> >>> series up, I can propose a option for that if anybody is worried
> >>> about it.
> >>
> >> I would not recommend going any further than maybe setting the
> >> ignore_memory_transaction_failures flag for boards you don't
> >> care about. (But in an ideal world, don't set it and deal with
> >> any bug reports by implementing stub versions of missing devices.
> >> Depends how confident you are in your test coverage.)
> > 
> > It seems it broke the "mac99" and  powernv10 machines, using the
> > qemu-ppc-boot images which are mostly buildroot. See below for logs.
> > 
> > Adding Mark for further testing on Mac OS.
>   
>
> Mac OS 9.2 fails to boot with a popup saying :
>         
>          Sorry, a system error occured.
>          "Sound Manager"
>            address error
>          To temporarily turn off extensions, restart and
>          hold down the shift key
>
>
> Darwin and Mac OSX look OK.

Might have to to restrict it to POWER machines for now then. Seems like
it will break working systems.

We could just log a guest error for the others.

Thanks,
Nick
Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Cédric Le Goater 10 months, 3 weeks ago
>>> It seems it broke the "mac99" and  powernv10 machines, using the
>>> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>>>
>>> Adding Mark for further testing on Mac OS.
>>    
>>
>> Mac OS 9.2 fails to boot with a popup saying :
>>          
>>           Sorry, a system error occured.
>>           "Sound Manager"
>>             address error
>>           To temporarily turn off extensions, restart and
>>           hold down the shift key
>>
>>
>> Darwin and Mac OSX look OK.
> 
> Might have to to restrict it to POWER machines for now then. Seems like
> it will break working systems.
> 
> We could just log a guest error for the others.

The "mac99" and powernv machines run fine now with the patches you
and Fred just sent. There is a "fix" window after soft freeze when
we can set the 'ignore_memory_transaction_failures' flag as Peter
suggested.

Thanks,

C.


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by Philippe Mathieu-Daudé 10 months, 4 weeks ago
On 23/6/23 14:37, Cédric Le Goater wrote:
> On 6/23/23 11:10, Peter Maydell wrote:
>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
>>>
>>> ppc has always silently ignored access to real (physical) addresses
>>> with nothing behind it, which can make debugging difficult at times.
>>>
>>> It looks like the way to handle this is implement the transaction
>>> failed call, which most target architectures do. Notably not x86
>>> though, I wonder why?
>>
>> Much of this is historical legacy. QEMU originally had no
>> concept of "the system outside the CPU returns some kind
>> of bus error and the CPU raises an exception for it".
>> This is turn is (I think) because the x86 PC doesn't do
>> that: you always get back some kind of response, I think
>> -1 on reads and writes ignored. We added the do_transaction_failed
>> hook largely because we wanted it to give more accurate
>> emulation of this kind of thing on Arm, but as usual with new
>> facilities we left the other architectures to do it themselves
>> if they wanted -- by default the behaviour remained the same.
>> Some architectures have picked it up; some haven't.
>>
>> The main reason it's a bit of a pain to turn the correct
>> handling on is because often boards don't actually implement
>> all the devices they're supposed to. For a pile of legacy Arm
>> boards, especially where we didn't have good test images,
>> we use the machine flag ignore_memory_transaction_failures to
>> retain the legacy behaviour. (This isn't great because it's
>> pretty much going to mean we have that flag set on those
>> boards forever because nobody is going to care enough to
>> investigate and test.)
>>
>>> Other question is, sometimes I guess it's nice to avoid crashing in
>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>>> command line option or something could turn it off? It should
>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>>> series up, I can propose a option for that if anybody is worried
>>> about it.
>>
>> I would not recommend going any further than maybe setting the
>> ignore_memory_transaction_failures flag for boards you don't
>> care about. (But in an ideal world, don't set it and deal with
>> any bug reports by implementing stub versions of missing devices.
>> Depends how confident you are in your test coverage.)
> 
> It seems it broke the "mac99" and  powernv10 machines, using the
> qemu-ppc-boot images which are mostly buildroot. See below for logs.

Since commit 21786c7e59 ("softmmu/memory: Log invalid memory accesses")
you can log the failed transaction with '-d guest_errors'. See for
example commit a13bfa5a05 ("hw/mips/jazz: Map the UART devices
unconditionally"):

   $ qemu-system-mips64el -M magnum -d guest_errors,unimp -bios NTPROM.RAW
   Invalid access at addr 0x80007004, size 1, region '(null)', reason: 
rejected
   Invalid access at addr 0x80007001, size 1, region '(null)', reason: 
rejected
   Invalid access at addr 0x80007002, size 1, region '(null)', reason: 
rejected
   Invalid access at addr 0x80007003, size 1, region '(null)', reason: 
rejected
   Invalid access at addr 0x80007004, size 1, region '(null)', reason: 
rejected

Boards booting successfully with ignore_memory_transaction_failures
set can often remove this flag by mapping missing accessed ranges as
TYPE_UNIMPLEMENTED_DEVICE. (You can then log the same accesses using
'-d unimp').


Re: [PATCH 0/4] target/ppc: Catch invalid real address accesses
Posted by BALATON Zoltan 10 months, 4 weeks ago
On Sat, 24 Jun 2023, Philippe Mathieu-Daudé wrote:
> On 23/6/23 14:37, Cédric Le Goater wrote:
>> On 6/23/23 11:10, Peter Maydell wrote:
>>> On Fri, 23 Jun 2023 at 09:21, Nicholas Piggin <npiggin@gmail.com> wrote:
>>>> 
>>>> ppc has always silently ignored access to real (physical) addresses
>>>> with nothing behind it, which can make debugging difficult at times.
>>>> 
>>>> It looks like the way to handle this is implement the transaction
>>>> failed call, which most target architectures do. Notably not x86
>>>> though, I wonder why?
>>> 
>>> Much of this is historical legacy. QEMU originally had no
>>> concept of "the system outside the CPU returns some kind
>>> of bus error and the CPU raises an exception for it".
>>> This is turn is (I think) because the x86 PC doesn't do
>>> that: you always get back some kind of response, I think
>>> -1 on reads and writes ignored. We added the do_transaction_failed
>>> hook largely because we wanted it to give more accurate
>>> emulation of this kind of thing on Arm, but as usual with new
>>> facilities we left the other architectures to do it themselves
>>> if they wanted -- by default the behaviour remained the same.
>>> Some architectures have picked it up; some haven't.
>>> 
>>> The main reason it's a bit of a pain to turn the correct
>>> handling on is because often boards don't actually implement
>>> all the devices they're supposed to. For a pile of legacy Arm
>>> boards, especially where we didn't have good test images,
>>> we use the machine flag ignore_memory_transaction_failures to
>>> retain the legacy behaviour. (This isn't great because it's
>>> pretty much going to mean we have that flag set on those
>>> boards forever because nobody is going to care enough to
>>> investigate and test.)
>>> 
>>>> Other question is, sometimes I guess it's nice to avoid crashing in
>>>> order to try to quickly get past some unimplemented MMIO. Maybe a
>>>> command line option or something could turn it off? It should
>>>> probably be a QEMU-wide option if so, so that shouldn't hold this
>>>> series up, I can propose a option for that if anybody is worried
>>>> about it.
>>> 
>>> I would not recommend going any further than maybe setting the
>>> ignore_memory_transaction_failures flag for boards you don't
>>> care about. (But in an ideal world, don't set it and deal with
>>> any bug reports by implementing stub versions of missing devices.
>>> Depends how confident you are in your test coverage.)
>> 
>> It seems it broke the "mac99" and  powernv10 machines, using the
>> qemu-ppc-boot images which are mostly buildroot. See below for logs.
>
> Since commit 21786c7e59 ("softmmu/memory: Log invalid memory accesses")
> you can log the failed transaction with '-d guest_errors'. See for
> example commit a13bfa5a05 ("hw/mips/jazz: Map the UART devices

This reminds me I'd still want to split this from guest_errors as 
discussed here:

https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg08757.html

Can we get a decision on how to call these debug options?

> unconditionally"):
>
>  $ qemu-system-mips64el -M magnum -d guest_errors,unimp -bios NTPROM.RAW
>  Invalid access at addr 0x80007004, size 1, region '(null)', reason: 
> rejected
>  Invalid access at addr 0x80007001, size 1, region '(null)', reason: 
> rejected
>  Invalid access at addr 0x80007002, size 1, region '(null)', reason: 
> rejected
>  Invalid access at addr 0x80007003, size 1, region '(null)', reason: 
> rejected
>  Invalid access at addr 0x80007004, size 1, region '(null)', reason: 
> rejected
>
> Boards booting successfully with ignore_memory_transaction_failures
> set can often remove this flag by mapping missing accessed ranges as
> TYPE_UNIMPLEMENTED_DEVICE. (You can then log the same accesses using
> '-d unimp').

The mac99 may have a lot of unimplemented devices and they are also not 
quite documented so it may even be difficult to find where they should be.

Regards,
BALATON Zoltan