OK, quite a lengthy summary here of a problem that has been going on for over a year, and today I am confident that I have got to the bottom of it.
I have previously posted several threads which provide a lot of detail for each occasion.
But in summary... I have had some serious reliability issues with all versions of UnRAID after 6.5.0.
The servers concerned are 2 x HP MicroServer G8s, 2 x HP DL380p G8s and, as of today, 1 x ML30 G9. Yes... I have replaced two servers thinking it was a hardware issue 😞
The machines are configured to boot in legacy mode and the USB key with UnRAID is installed on the internal USB port. However, when the machine boots it sometimes shows a critical BIOS error in the HP ILO, and once this error shows you have no keyboard/mouse control in the ILO remote console screen - so all you can do is force the machine to reboot using the hold-power option in the ILO. Then in most cases after the machine boots, it fails to recognise the USB key in the internal USB port, and the only way I'd found to boot the machine back into UnRAID was to move the USB key to an external port and reboot.
As I had never managed to sort this out, I bought a new ML30 G9 over Christmas and built it today with a trial key using v6.8.1. As soon as it started I selected the 'UnRAID with GUI' option and it booted into UnRAID. Immediately the server showed a critical PCI-Express error, and from the ILO remote console the keyboard/mouse wouldn't work... So I logged into UnRAID from the LAN GUI (which worked OK) and shut down the machine. It wouldn't shut down, so I had to force it with a 'hold power button' from ILO. When the machine was powered down it still showed the critical BIOS PCI-Express error, so I removed the power cord and pressed the power button to discharge everything. When I reconnected power and got on the ILO, the error had cleared.
I then rebooted the machine and forgot to change to the 'UnRAID with GUI' boot option, which I always have as default so I can get into UnRAID from ILO. I noticed the error didn't appear, so I rebooted and chose the 'UnRAID with GUI' option, and immediately the critical error flashed up!
I have a friend who runs UnRAID 6.8.0 on another MicroServer G8 which is completely remote from me. He switched his to 'UnRAID with GUI' mode and also found he had lost ILO keyboard/mouse - so he rebooted and it's now stuck on a 'boot device not found' error, as it would seem it can't recognise the internal USB port at the moment. When he gets home he will cold boot it and I know it will be fine again!
I have attached two diagnostics .zip files from my ML30 G9 today - one was generated when the machine DIDN'T have the critical BIOS error and one when it was in critical condition - in case they are different! The one generated at 11:46 is in normal condition and the 11:55 one is in critical condition.
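For anyone who wants to check whether the two archives actually differ, here is a minimal Python sketch that diffs the text files inside two zip files. It is only an illustration: the function names `read_members` and `diff_archives` are my own, and it assumes each archive wraps its contents in a single top-level folder (which the sketch strips so that paths line up). I have not confirmed the internal layout, so turn `strip_top_level` off if the paths don't match.

```python
import difflib
import zipfile
from pathlib import PurePosixPath


def read_members(zip_path, strip_top_level=True):
    """Return {member path: list of text lines} for every file in the archive."""
    members = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            # Assumption: files sit under one top-level folder named after the
            # archive, so drop that folder to make the two sets of paths comparable.
            if strip_top_level and len(parts) > 1:
                parts = parts[1:]
            text = archive.read(info).decode("utf-8", errors="replace")
            members["/".join(parts)] = text.splitlines()
    return members


def diff_archives(path_a, path_b, strip_top_level=True):
    """Return {member path: unified diff lines} for files that differ or are missing."""
    files_a = read_members(path_a, strip_top_level)
    files_b = read_members(path_b, strip_top_level)
    differences = {}
    for name in sorted(set(files_a) | set(files_b)):
        lines_a = files_a.get(name, [])
        lines_b = files_b.get(name, [])
        if lines_a != lines_b:
            differences[name] = list(
                difflib.unified_diff(
                    lines_a, lines_b,
                    fromfile=f"a/{name}", tofile=f"b/{name}", lineterm="",
                )
            )
    return differences
```

With the two attachments saved locally, `diff_archives("hector2-diagnostics-20200117-1146.zip", "hector2-diagnostics-20200117-1155.zip")` would return the files that changed between the normal and critical states. Expect noise from timestamps in the logs, so the interesting part is any new PCI-Express or USB related lines.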
So far I have replaced my MicroServer with another MicroServer and then bought this ML30 G9, as I thought it was a compatibility issue with the MicroServer, and I replaced my DL380 G8 in the datacentre as I thought there was a problem with the original machine. All in all it's been quite costly for me to get to this point, so it would be nice to know that this IS the cause and whether it can be fixed.
I have the virgin ML30 G9 with a trial key and no data or config - so if you want anything done on this to try and find the cause please let me know.
Hope to hear back soon!
hector2-diagnostics-20200117-1146.zip hector2-diagnostics-20200117-1155.zip