Himself Posted January 21, 2018

I have a random problem I'm trying to solve. I have two HP Microservers (N54L), both running Unraid 6.3, and both have been running perfectly for over a year. I upgraded the first to 6.4 and everything is fine. Simple process. Reboot. Perfection.

Since upgrading the second, it runs for a short while and then Unraid stops. The server is still powered, and if I press the power button the server will start without issue, and then after a short period shut down again. Had to be a hardware fault.

So I swapped the disks into the other Microserver (the one that upgraded and ran without error), and the problem has followed the disks and the memory stick they boot from. I've changed the power lead and the ethernet cable. Tried a different port on the switch, just in case. The server never loses power; Unraid just stops running. Even the green light on the power switch stays green. Moving to another chassis seems a fruitless move.

Is there a way to go back to 6.3, as this is what I changed? (First rule of bug fixing: what changed?) Preserving the data on the array is the primary concern ~ naturally :-) ~ or just getting the system to run long enough to copy it off.

Most bizarre... any suggestions?

Squid Posted January 21, 2018

Can you still access the server via SSH or at the local monitor / keyboard? If so, post diagnostics.

5 hours ago, Himself said:
    Is there a way to go back to 6.3

On the flash drive, copy the contents of the "previous" folder to the root of the flash and reboot.

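The copy step described above can be sketched in Python. This is only a rough illustration, assuming the flash drive is mounted at /boot (the usual Unraid location, but check your own system) and that the upgrade left a "previous" folder there. The function name `restore_previous` is made up for this example; a plain recursive copy from the shell would do the same job.

```python
import shutil
from pathlib import Path


def restore_previous(flash_root):
    """Copy every file under <flash_root>/previous up into <flash_root>.

    Existing files with the same name are overwritten.
    Returns the relative paths of the files that were copied.
    """
    root = Path(flash_root)
    previous = root / "previous"
    if not previous.is_dir():
        raise FileNotFoundError(f"no 'previous' folder under {root}")

    copied = []
    for src in sorted(previous.rglob("*")):
        relative = src.relative_to(previous)
        target = root / relative
        if src.is_dir():
            # Recreate sub-folders at the flash root if they are missing.
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            # Copy file contents only; no permission or timestamp metadata.
            shutil.copyfile(src, target)
            copied.append(relative.as_posix())
    return copied
```

Under those assumptions, `restore_previous("/boot")` followed by a reboot would correspond to the step Squid describes. Taking a backup of the flash drive first is a sensible precaution.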
Himself Posted January 21, 2018

Thank you Squid. I thought I'd deleted this post, as I believe I've found the problem: S3 Sleep.

The strange thing was that one server was fine and the other acted up. Swap the boot USB and the disks and the problem moves too. Removing S3 Sleep appears to have removed the problem.

The good news is that the server upgrade I've been planning has finally been done. There is always an upside.

Frank1940 Posted January 21, 2018

One thing you might check is the BIOS version on the two MBs...

Himself Posted January 21, 2018

Nope... thought I had it fixed, but I don't.

So, the original problem that moved from one HP Microserver to the other was fixed by removing S3 Sleep.

The new large server I built (i7, P6X58D motherboard) ran for 2 hours when booted with the GUI. Boot the same server headless and it goes to sleep. Same basic issue: the shell is still active, but there is no disk or IP connectivity.

The test currently running is to reboot the new server with the GUI running, try to pump the data from the HP to the new server, and see if it will stay running.

Squid Posted January 21, 2018

2 minutes ago, Himself said:
    The new large server I built (i7, P6X58D motherboard) ran for 2 hours when booted with the GUI. Boot the same server headless and it goes to sleep. Same basic issue: the shell is still active, but there is no disk or IP connectivity.

The sleep plugin currently has issues on 6.4.

Himself Posted January 21, 2018

So... Sleep has been removed. It isn't the GUI boot either. The server ran for a couple of hours before developing this "feature".

Fix Common Problems has just thrown up a "Call Traces" problem.

Time to go back to 6.3, I think... Wonder if I can remember Unix commands... but I'll run diagnostics first.

Himself Posted January 21, 2018

Diagnostics run and attached... forgive the unimaginative server name :-)

If anyone can help a Unix idiot who last looked at the Unix command line 15 years ago with some instructions on how to downgrade, I'd be obliged.

Total disclaimer: if you are kind enough to help me, the risk of any action is on me :-)

deathstar-diagnostics-20180121-1418.zip

Squid Posted January 21, 2018

Diagnostics are from 3 minutes after a reboot. No call trace or anything to go on.

Himself Posted January 21, 2018

OK -- finally worked out a way of letting it fail and then getting the diagnostics. It seems to be losing network connectivity, as the GUI is running (localhost) but nothing is accessible across the network.

Hopefully these diagnostics point the way. Thanks for the help. Much appreciated.

deathstar-diagnostics-20180121-1809.zip

Squid Posted January 21, 2018

Your addon ethernet card is having a heart attack. If you got it to replicate the trace by playing with the cabling, etc., then don't worry about that. But the trace is from when a transmission timed out, and the driver is maybe a little too chatty in letting you know about it by issuing a trace.

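For spotting this kind of event in a syslog, here is a small Python sketch. It assumes the timeout shows up as the Linux kernel's usual "NETDEV WATCHDOG ... transmit queue N timed out" line; the exact wording can differ between kernel versions, and the pattern and the function name `find_tx_timeouts` are illustrative, not taken from the attached diagnostics.

```python
import re

# ASSUMPTION: typical kernel watchdog wording; adjust for your kernel version.
TX_TIMEOUT = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(lines):
    """Return (interface, driver, queue) for each transmit-timeout line."""
    hits = []
    for line in lines:
        match = TX_TIMEOUT.search(line)
        if match:
            hits.append(
                (match["iface"], match["driver"], int(match["queue"]))
            )
    return hits
```

Feeding it the lines of the syslog file from a diagnostics zip would show which interface and driver reported the timeout, which helps tell a built-in port from an add-on card.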
Himself Posted January 21, 2018

The ethernet ports are the built-in ones on the motherboard. Time to change the cable and the switch port. I would have said I'd already done that, but with all the rebuilding I may have reused something.

Thanks for your help, Squid. I do appreciate it.

Cheers

Himself Posted January 21, 2018

Still running tests, but on the "new" i7-based server it looks to have been jumbo frames that were the problem. With an MTU of 1500 we seem to be running without error.

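To check whether an interface is still set to jumbo frames, a minimal Python sketch follows. It assumes a Linux system that exposes the MTU under /sys/class/net/<interface>/mtu; the interface name and the helper names `read_mtu` and `uses_jumbo_frames` are placeholders for illustration.

```python
from pathlib import Path


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Return the MTU that Linux sysfs reports for a network interface."""
    mtu_file = Path(sysfs_root) / interface / "mtu"
    return int(mtu_file.read_text().strip())


def uses_jumbo_frames(mtu, standard_mtu=1500):
    """Treat any MTU above the standard Ethernet value as jumbo frames."""
    return mtu > standard_mtu
```

For example, `uses_jumbo_frames(read_mtu("eth0"))` would return True on an interface left at 9000, assuming the interface is called eth0. Every device on the path, switch included, needs to agree on the larger size for jumbo frames to work, which is why dropping back to 1500 is a quick way to rule them out.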
Himself Posted January 24, 2018

Two days in and 6TB of data transferred. All is well.

Jumbo frames were the problem with the "new" server. Sleep seems to have been the problem with one of the HP Microservers.

Thanks to Squid for walking me through a few processes and some fault finding.