Last night one of our critical control systems crashed and powered off. The only indication of the problem is in the ESM log. When the system was powered back on, the error condition on the panel went away. All appears fine now. The big question is: why? And will it happen again?
This is a PE6850 quad Xeon 3.0GHz running Windows 2000 Server SP4
BIOS A00
BMC ver 1.33
Backplane firmware 1.0
ESM log:
Sun Dec 03 23:18:05 2006 PROC_2 Status processor sensor recovered from IERR
Sun Dec 03 23:18:05 2006 PROC_1 Status processor sensor recovered from IERR
Sun Dec 03 21:48:14 2006 PROC_2 Status processor sensor IERR
Sun Dec 03 21:48:14 2006 PROC_1 Status processor sensor IERR
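To see how long each processor sat in the IERR state, the log timestamps can be subtracted. Below is a minimal Python sketch, assuming every ESM line has the layout shown above (timestamp, sensor name, then the status text). The helper names `parse_line` and `ierr_durations` are my own and are not part of any Dell tool.

```python
from datetime import datetime

# The four ESM entries copied from above (newest first, as the log shows them).
ESM_LOG = """\
Sun Dec 03 23:18:05 2006 PROC_2 Status processor sensor recovered from IERR
Sun Dec 03 23:18:05 2006 PROC_1 Status processor sensor recovered from IERR
Sun Dec 03 21:48:14 2006 PROC_2 Status processor sensor IERR
Sun Dec 03 21:48:14 2006 PROC_1 Status processor sensor IERR
"""


def parse_line(line):
    """Split one ESM line into (timestamp, sensor name, recovered flag).

    Assumes the first five whitespace-separated fields are the timestamp
    and the sixth is the sensor name, as in the entries above.
    """
    parts = line.split()
    stamp = datetime.strptime(" ".join(parts[:5]), "%a %b %d %H:%M:%S %Y")
    sensor = parts[5]
    recovered = "recovered" in parts[6:]
    return stamp, sensor, recovered


def ierr_durations(log_text):
    """Return {sensor: time between the IERR entry and its recovery entry}."""
    asserted, cleared = {}, {}
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, sensor, recovered = parse_line(line)
        if recovered:
            cleared[sensor] = stamp
        else:
            asserted[sensor] = stamp
    # Only report sensors that have both an IERR and a recovery entry.
    return {s: cleared[s] - asserted[s] for s in asserted if s in cleared}


durations = ierr_durations(ESM_LOG)
for sensor in sorted(durations):
    print(sensor, durations[sensor])
```

For the four entries above this gives 1:29:51 for both PROC_1 and PROC_2, so both processors flagged IERR in the same second and cleared in the same second about an hour and a half later.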
I know the obvious answer is to upgrade the BIOS, firmware, and such, but if there is an answer to why this might have happened I would appreciate it.
Thanks,
Bill