Today I tried to apply three updates to three Dell R620s, as follows:
- BIOS 2.4.3 to 2.5.2
- Intel NIC firmware from 16.0.24 to 16.5.20
- iDRAC Firmware Version to 2.10.10 (iDRAC Settings Version remains 1.65.65.04)
All three R620s started with identical firmware.
I applied the updates to all three systems at the same time through the update mechanism in the iDRAC Express web GUI.
On the first two systems, the updates worked fine.
The last system is where I'm in trouble. When I came back to it, it was "booting" and displaying "Entering Lifecycle Controller ...", but after half an hour it still hadn't entered it! Every time I booted it, it did the same thing: "Entering Lifecycle Controller ...", and then, at the point where the Intel Boot Agent initializes, there would be two dots, then a blank line, and then the machine would just sit there.
Through racadm, I managed to reset the Lifecycle Controller state and clear the job queue, hoping that I could just reflash the updates and everything would be fine. Now the system does exactly the same thing, except that it no longer tries to enter the Lifecycle Controller. While it's sitting there, it's not "completely" hung: if I hit F2 it says that it's entering System Setup, and if I hit F10 it says that it's going to PXE boot. In fact, you can hit all the various F keys and it will say that it's going to Enter System Setup, PXE Boot, Enter BIOS Boot Menu, and so on.
I suspect it has to do with a botched NIC upgrade. Here's what I see in the racadm swinventory output for the BIOS:
-------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = NA
Rollback Version = 2.4.3
-------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = NA
Available Version = 2.5.2
-------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = 2015-03-12T12:49:29Z
Current Version = 2.4.3
-------------------------------------------------------------------
Hmmm ... the Current Version is 2.4.3, with an Available Version of 2.5.2, which is odd, since the machine reports BIOS 2.5.2 when it boots.
And for the NICs:
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G 4P X540/I350 rNDC - B8:CA:3A:69:3A:62
FQDD = NIC.Integrated.1-2-1
InstallationDate = NA
Rollback Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G 4P X540/I350 rNDC - B8:CA:3A:69:3A:62
FQDD = NIC.Integrated.1-2-1
InstallationDate = NA
Available Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G 4P X540/I350 rNDC - B8:CA:3A:69:3A:62
FQDD = NIC.Integrated.1-2-1
InstallationDate = 2015-03-13T10:50:01Z
Current Version = 16.0.24
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 4P X540/I350 rNDC - B8:CA:3A:69:3A:64
FQDD = NIC.Integrated.1-3-1
InstallationDate = NA
Rollback Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 4P X540/I350 rNDC - B8:CA:3A:69:3A:64
FQDD = NIC.Integrated.1-3-1
InstallationDate = NA
Available Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 4P X540/I350 rNDC - B8:CA:3A:69:3A:64
FQDD = NIC.Integrated.1-3-1
InstallationDate = 2015-03-13T10:50:07Z
Current Version = 16.0.24
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 4P X540/I350 rNDC - B8:CA:3A:69:3A:65
FQDD = NIC.Integrated.1-4-1
InstallationDate = NA
Rollback Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 4P X540/I350 rNDC - B8:CA:3A:69:3A:65
FQDD = NIC.Integrated.1-4-1
InstallationDate = NA
Available Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 4P X540/I350 rNDC - B8:CA:3A:69:3A:65
FQDD = NIC.Integrated.1-4-1
InstallationDate = 2015-03-13T10:50:08Z
Current Version = 16.0.24
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 2P I350-t Adapter - A0:36:9F:2E:F8:30
FQDD = NIC.Slot.3-1-1
InstallationDate = NA
Rollback Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 2P I350-t Adapter - A0:36:9F:2E:F8:30
FQDD = NIC.Slot.3-1-1
InstallationDate = NA
Available Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 2P I350-t Adapter - A0:36:9F:2E:F8:30
FQDD = NIC.Slot.3-1-1
InstallationDate = 2015-03-13T10:50:03Z
Current Version = 16.0.24
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 2P I350-t Adapter - A0:36:9F:2E:F8:31
FQDD = NIC.Slot.3-2-1
InstallationDate = NA
Rollback Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 2P I350-t Adapter - A0:36:9F:2E:F8:31
FQDD = NIC.Slot.3-2-1
InstallationDate = NA
Available Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Gigabit 2P I350-t Adapter - A0:36:9F:2E:F8:31
FQDD = NIC.Slot.3-2-1
InstallationDate = 2015-03-13T10:50:05Z
Current Version = 16.0.24
-------------------------------------------------------------------
So the new versions got uploaded, but the current version is not the new version.
I've checked the other two systems, and there the current version is the new one.
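For anyone who wants to run the same comparison on their own machines, here is a rough Python sketch that reads swinventory text like the output above and lists every FQDD whose Available version differs from its Current version. The function names (`parse_swinventory`, `pending_updates`) are my own, and it assumes the output keeps the same "key = value" layout with dashed separator lines; it doesn't call racadm itself.

```python
from collections import defaultdict

# Sample trimmed from the swinventory output above (BIOS plus one NIC port).
SAMPLE = """\
-------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = NA
Rollback Version = 2.4.3
-------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = NA
Available Version = 2.5.2
-------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = 2015-03-12T12:49:29Z
Current Version = 2.4.3
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G 4P X540/I350 rNDC - B8:CA:3A:69:3A:62
FQDD = NIC.Integrated.1-2-1
InstallationDate = NA
Rollback Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G 4P X540/I350 rNDC - B8:CA:3A:69:3A:62
FQDD = NIC.Integrated.1-2-1
InstallationDate = NA
Available Version = 16.5.20
-------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G 4P X540/I350 rNDC - B8:CA:3A:69:3A:62
FQDD = NIC.Integrated.1-2-1
InstallationDate = 2015-03-13T10:50:01Z
Current Version = 16.0.24
-------------------------------------------------------------------
"""


def parse_swinventory(text):
    """Split swinventory-style text into one dict per dashed-line record."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("---"):
            # A separator line closes the record collected so far, if any.
            if current:
                records.append(current)
                current = {}
        elif "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def pending_updates(records):
    """Map FQDD -> (current, available) where the two versions differ."""
    versions = defaultdict(dict)
    for rec in records:
        for kind in ("Current", "Available", "Rollback"):
            key = f"{kind} Version"
            if key in rec and "FQDD" in rec:
                versions[rec["FQDD"]][kind] = rec[key]
    return {
        fqdd: (v["Current"], v["Available"])
        for fqdd, v in versions.items()
        if "Current" in v and "Available" in v and v["Current"] != v["Available"]
    }


if __name__ == "__main__":
    for fqdd, (cur, avail) in pending_updates(parse_swinventory(SAMPLE)).items():
        print(f"{fqdd}: current {cur}, available {avail}")
```

On a healthy system I would expect this to report nothing for the components that were updated, since the Current and Available versions should match there.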
I've tried going back into the iDRAC7 web GUI to roll back, but it says that it can't do that.
At the same time, the BIOS boot screen shows the BIOS as 2.5.2!
I had a chat with Dell. Their conclusion was that the motherboard, and possibly the network daughter card, needs to be replaced; someone will come out on Friday to do that. That said, I'm 99% sure there's a way to fix this other than replacing the whole board. I say 99% because I suspect that solving the problem requires the system to enter the Lifecycle Controller, which isn't working right now, and any job I queue up through the web GUI or racadm isn't going to do much if the system can't get into the Lifecycle Controller.
I can still access the iDRAC via web and SSH. I tried submitting a job to the job queue to change LegacyBootProto from PXE to NONE, thinking this MIGHT get rid of the Intel Boot Agent prompt, but when the system boots, the job doesn't get processed:
racadm jobqueue view
-------------------------JOB QUEUE------------------------
[Job ID=JID_285566585779]
Job Name=Configure: NIC.Integrated.1-1-1
Status=Scheduled
Start Time=[Now]
Expiration Time=[Not Applicable]
Message=[JCP001: Task successfully scheduled.]
----------------------------------------------------------
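To keep an eye on whether a job like this ever leaves the Scheduled state across reboots, a small Python sketch along these lines can parse the job queue text. The names `parse_jobqueue` and `still_scheduled` are made up for illustration, and it assumes the output always has the "[Job ID=...]" and "Key=Value" layout shown above.

```python
# Sample copied from the "racadm jobqueue view" output above.
SAMPLE = """\
-------------------------JOB QUEUE------------------------
[Job ID=JID_285566585779]
Job Name=Configure: NIC.Integrated.1-1-1
Status=Scheduled
Start Time=[Now]
Expiration Time=[Not Applicable]
Message=[JCP001: Task successfully scheduled.]
----------------------------------------------------------
"""


def parse_jobqueue(text):
    """Return one dict per job found in jobqueue-style text."""
    jobs, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[Job ID=") and line.endswith("]"):
            # A "[Job ID=...]" line starts a new job entry.
            current = {"Job ID": line[len("[Job ID="):-1]}
            jobs.append(current)
        elif current is not None and "=" in line and not line.startswith("-"):
            key, _, value = line.partition("=")
            # Drop the square brackets racadm puts around some values.
            current[key.strip()] = value.strip().strip("[]")
    return jobs


def still_scheduled(jobs):
    """Return the IDs of jobs whose status is still 'Scheduled'."""
    return [job["Job ID"] for job in jobs if job.get("Status") == "Scheduled"]


if __name__ == "__main__":
    for job_id in still_scheduled(parse_jobqueue(SAMPLE)):
        print(f"{job_id} has not been processed yet")
```

Running it against the output captured before and after a reboot would show whether the same job ID is still waiting.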
I don't even know whether the problem is really the Intel Boot Agent or not.
I've thought about one other possibility. If it is the Intel Boot Agent, and I disable the daughter card NIC through racadm, then I would immediately lose access to the iDRAC and not be able to revert to the previous state. But if disabling the card would let the boot get far enough that I could get back into the Lifecycle Controller and fix things, then maybe it's worth it? Dunno.
Any thoughts?
Jason.