Hi,
A recent failure of one of my systems revealed an issue with memory error
logging: the flood of messages reporting corrected ECC errors made the
system unusable, even though the errors themselves had been recovered from
and the messages served an informational purpose only.
I took the opportunity to verify that the rate-limiting does its job on
the offending system before cleaning the memory module contacts, which
cured the original problem. This is the third time it has happened in the
~25 years I've had the system -- not too bad, but clearly a recurring
issue.
For consistency I have also updated support for the other two DEC memory
system designs. They're parity-based, so memory errors are fatal and
consequently less likely to cause a message flood, although one is still
possible in principle where a faulty memory location causes a bus error
exception to kill user processes repeatedly. Those designs seem not to
have the issue with memory contacts though, as they use the common SIMM
design rather than 0.1"-pitch PCB connectors.
Please apply.
Maciej