On 18/01/2024 07.03, Michael Tokarev wrote:
> 17.01.2024 01:31, Matthew Rosato:
>> Commit ef1535901a0 (re-)introduced an issue where passthrough ISM devices
>> on s390x would enter an error state after reboot. This was previously fixed
>> by 03451953c79e, using device reset callbacks, however the change in
>> ef1535901a0 effectively triggers a cold reset of the pci bus before the
>> device reset callbacks are triggered.
>>
>> To resolve this, this series proposes to remove the use of the reset callback
>> for ISM cleanup and instead trigger ISM reset from subsystem_reset before
>> triggering bus resets. This has to happen before the bus resets because the
>> reset of s390-pcihost will trigger reset of the PCI bus followed by the
>> s390-pci bus, and the former will trigger vfio-pci reset / the aperture-wide
>> unmap that ISM gets upset about.
>>
>>   /s390-pcihost (s390-pcihost)
>>     /pci.0 (PCI)
>>     /s390-pcibus.0 (s390-pcibus)
>>
>> While fixing this, it was also noted that kernel warnings could be seen that
>> indicate a guest ISC reference count error. That's because in some reset
>> cases we were not bothering to disable AIF, but would again re-enable it after
>> the reset (causing the reference count to grow erroneously). This was a base
>> issue that went unnoticed because the kernel previously did not detect and
>> issue a warning for this scenario.
>
> Is it a -stable material, or not worth picking up for stable?

It's definitely stable material, but IIUC there will be a v2 with some minor
fixes.

 Thomas