[UPDATED] Woke up this morning to find all my dockers unresponsive



I checked a few docker logs and I was receiving chown errors on pretty much all my running dockers.  I ran a safe reboot (mistake!!) and now I can't get back into either the localhost GUI or the web GUI.  During boot, on the terminal screen, there was a message about an unexpected character at the end of /var/tmp/network.cfg, but the rest was completely obscured by garbled characters.  I was able to grab a diag from the localhost after the reboot.  Hopefully I'm not totally F'd here.  I can get around Linux but I am very far from an expert.  Any suggestions?

andromeda-diagnostics-20181027-0842.zip

 

EDIT:  Running a memtest now 10:00AM

Edited by depreciated_
memtest
Link to comment

Alright, for the last few hours I've been trying different things here and there.  The biggest problem is that the GUI will not load at all (neither the local GUI mode with a monitor & KB plugged into the server, nor the web GUI), and I kept finding corrupt config files on the flash drive.

 

So I loaded a fresh install of Unraid and followed the instructions listed on the wiki to transfer the few non-corrupt settings I wanted to get the array back up, and I'm getting whatever the hell THIS is every time I boot off this machine lol

1666079799_ScreenShot2018-10-27at4_30_09PM.thumb.png.3f633014ce307d47f62d7cf94fe65479.png

1676457597_ScreenShot2018-10-27at4_30_14PM.thumb.png.b16bc5c3f3430cd71b3f0e3429a00084.png

 

I booted two more flash drives, on different ports on this server with the same results.  I can boot from either of these flash drives off 2 other machines, so I know the drives are good.  I'm back to running an extended memtest. 

 

Something has to be wrong with the hardware?  No BIOS settings were changed.

 

Link to comment

I'm baffled to be honest.  I'm not sure how this even happened.  I had a few different ssh sessions open the night before the system went down.  I may have inadvertently sent a shell command in the wrong window.... Here's what I've found.

-Ran memtest 10 passes - 0 errors

-Booted the machine into a Windows install - fine

-Set BIOS as per the troubleshooting wiki - still booting to the screen in the screenshots above

 

Out of frustration, I finally pulled the drives and booted in.  It loaded properly but with the drives reporting as missing (GOOD).  I was able to set the key, but it would only boot to the normal GUI if one particular drive was kept out.

 

I swapped that drive to a new bay to check if it was a backplane issue but encountered similar results.  When running diagnostics, nothing was being written to the flash drive.  I hooked up the odd drive to an external enclosure and found an Unraid install disk...with the diagnostics that I had just been running.

 

Somehow, I managed to clone the USB boot partition to one of my array disks while the array was running....lol
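
For anyone who wants to check for the same situation without pulling drives, here is a rough sketch that looks for more than one block device carrying the UNRAID filesystem label (the label the boot flash is required to have). It assumes `lsblk` with JSON output (`-J`) is available on the machine. The device names in `sample` are made up for illustration and are not taken from my diagnostics, and `find_labeled` and `scan` are hypothetical helper names.

```python
import json
import subprocess


def find_labeled(devices, label="UNRAID"):
    """Recursively collect names of devices/partitions whose label matches."""
    hits = []
    for dev in devices:
        if dev.get("label") == label:
            hits.append(dev["name"])
        # Partitions are nested under "children" in lsblk's JSON output.
        hits.extend(find_labeled(dev.get("children", []), label))
    return hits


def scan(label="UNRAID"):
    """Run lsblk on the live system and return matching device names."""
    out = subprocess.run(
        ["lsblk", "-J", "-o", "NAME,LABEL,FSTYPE,SIZE"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_labeled(json.loads(out)["blockdevices"], label)


# Hypothetical lsblk output: the real flash (sda1) plus an array disk (sdd1)
# that has had the boot partition cloned onto it.
sample = [
    {"name": "sda", "label": None, "children": [
        {"name": "sda1", "label": "UNRAID", "fstype": "vfat"}]},
    {"name": "sdb", "label": None, "children": [
        {"name": "sdb1", "label": None, "fstype": "xfs"}]},
    {"name": "sdd", "label": None, "children": [
        {"name": "sdd1", "label": "UNRAID", "fstype": "vfat"}]},
]

duplicates = find_labeled(sample)
```

On a live system, calling `scan()` and getting back more than one name would suggest the boot process might be picking up the wrong device as the flash.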

 

So what should I do next?  My idea was to format the "odd" disk and boot the array degraded and rebuild it from the parity disk.  The data on the disk at this point is trashed.  Any thoughts?

Edited by depreciated_
Link to comment
2 hours ago, depreciated_ said:

My idea was to format the "odd" disk and boot the array degraded and rebuild it from the parity disk.

DON'T format anything. Format writes an empty filesystem to the disk and parity treats this write operation just like any other, so parity will agree the disk has an empty filesystem.
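
To illustrate why, here is a minimal sketch that models single parity as a byte-wise XOR across the data disks. This is a simplified toy model, not Unraid's actual implementation, and the disk contents and function names are invented for the example.

```python
from functools import reduce


def xor_parity(disks):
    """Byte-wise XOR across equally sized disks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild(parity, surviving_disks):
    """Reconstruct the one missing disk from parity plus the other disks."""
    return xor_parity([parity, *surviving_disks])


disk1 = b"movies.."
disk2 = b"photos.."          # stand-in for the "odd" disk
parity = xor_parity([disk1, disk2])

# Case 1: disk2 is pulled WITHOUT formatting. Parity still describes the
# old contents, so a rebuild brings them back.
rebuilt_before_format = rebuild(parity, [disk1])

# Case 2: disk2 is formatted while it is part of the array. The format is
# just another write, so parity is updated to match the empty filesystem.
disk2 = b"\x00" * 8          # stand-in for an empty filesystem
parity = xor_parity([disk1, disk2])
rebuilt_after_format = rebuild(parity, [disk1])
```

In this model, `rebuilt_before_format` is the original `b"photos.."`, while `rebuilt_after_format` is the empty stand-in: once the format has gone through the array, there is nothing left in parity to rebuild from.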

Link to comment
