We have three R820s that crash and reboot within minutes of each other, usually between 6:00 and 7:00 AM on either Thursday or Friday morning. Initially, the error referenced PSU2, so we did some research and decided to update the firmware on one of the hosts as a test. After we successfully updated the firmware on that host, sure enough, the other two servers that weren't updated crashed. The server with the updated firmware did not crash, so we thought we had the issue figured out and decided to update the firmware on the remaining two servers.
Unfortunately, the firmware update bricked one of the PSUs, so we're waiting on a replacement there. It seems the firmware updates did not work, though, because sure enough, this morning the two remaining servers crashed, and this time the error was much more vague:
Here's the entry from one server:
Thu Dec 18 06:30:25 2014 | A runtime critical stop occurred. |
Here's the entry from the other:
Thu Dec 18 06:29:51 2014 | A runtime critical stop occurred. |
Error seems to have gotten a lot more general now!
Anyone have any idea where to look now?
Thanks!