UnRAID completely broken after running docker safe new permissions. [SOLVED]


Pyro


6.5.3

I have a drive that's being emulated because it's failing, and 3 others with smart errors. My junk is falling apart. I tried to fix that yesterday, but I think I made it so much worse.

 

I attempted to use the UnBalance mover to recover the files on the failing drive, and it recommended running docker safe new permissions first. I ran it, and everything broke. I couldn't see shares anymore, Windows couldn't pick up the shares, Plex wouldn't play anything, it was bad. My first thought was to log out and back in again, but that made things go from bad to worse. Now I can't access anything. I put in my root password and get "No such file or directory" upon root login.

 

...how boned am I?

 

It was suggested on Reddit that I try reloading the OS on my USB (everything except the config), but that didn't help.

JYpprvC[1].png

PQYhVeC[1].jpg

Edited by Pyro
Link to comment

You didn't need to recover the files on the failing drive. Simply replacing the drive and letting it rebuild would probably have been the correct thing to do, depending on the health of the rest of your array. It sounds as though you have let problems mount up by either ignoring them or being unaware of them. Do you have notifications enabled? Post your diagnostics.

Link to comment
36 minutes ago, John_M said:

It sounds as though you have let problems mount up by either ignoring them or being unaware of them. Do you have notifications enabled?

Your first screenshot is showing warning indicators on 3 additional drives besides the disabled one, so possibly many things going wrong besides just the disabled disk. None of that should prevent you from logging in though so probably some corruption on flash as well.

Link to comment

When I tried to log into the webUI (Chrome), it returned a "connection refused" error. When I try to log into the machine itself I get the second picture: "No such file or directory" and another login prompt. I'm at work now, I'll type up the full story when I go to lunch.

Link to comment

If I deserve the dunce cap here, I'll set it as my avatar. I think I did everything the best that I could.

 

I started with three 2tb "refurbished" data center drives and added a 1.5tb and two 1tb 2.5" drives. All junk I know, but it's what I had available to me. A few months ago I replaced the 2tb parity with a new WD red 8tb, added that 2tb to the array, and added a 120gb SSD for a cache drive. I removed one of the 2.5" drives at that time. Doing my best to remove the garbage I started with.

 

Then I started seeing smart errors. They were few and far between, so I kinda let it go. I get an email for every problem. I knew I needed to do something, but I couldn't just materialize replacement hard drives. On Friday I picked up a WD 10tb white label. I don't think I can just drop it in because it's bigger than the parity. My plan was to:

1. Recover the data from my failed drive into the remaining questionable-but-still-living drives

2. Replace the parity with the 10tb

3. Rebuild the parity

4. Install the 8tb

5. Transfer everything to that drive

6. Remove all of my junk drives, leaving nothing but the 10tb parity, 8tb data, and cache SSD.
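
The size constraint behind steps 2 to 4 can be sketched in a few lines of Python. Unraid needs the parity disk to be at least as large as every data disk, which is why the 10tb drive cannot join as a data disk while the 8tb is still parity. The function name `fits_under_parity` is hypothetical and the sizes are the nominal ones quoted in this post.

```python
def fits_under_parity(parity_tb, data_disk_tb):
    """Return True if a data disk of this size is allowed under the given parity.

    Rule modelled: no data disk may be larger than the parity disk.
    Sizes are nominal terabytes, used only for illustration.
    """
    return data_disk_tb <= parity_tb


# Today: 8tb parity, trying to add the 10tb as a data disk.
before_swap = fits_under_parity(parity_tb=8, data_disk_tb=10)

# After step 2: 10tb is parity, and the old 8tb parity becomes a data disk.
after_swap = fits_under_parity(parity_tb=10, data_disk_tb=8)

print("10tb data under 8tb parity allowed:", before_swap)
print("8tb data under 10tb parity allowed:", after_swap)
```

Real disks are compared by exact capacity rather than the label on the box, so treat this as a picture of the rule and not a replacement for what the webUI reports.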

 

Obviously this did not go to plan, and said plan failed at step 1. Spectacularly.

I don't know if it matters or helps, but the machine is a Celeron 3930, 8gb ddr4, h170 mobo.

Edited by Pyro
Link to comment
1 hour ago, Pyro said:

When I try to log into the machine itself I get the second picture. "No such file or directory" and another login prompt.

That sounds like corruption of the passwd file, as seen in another couple of threads recently. You say that you re-created your boot flash from scratch but used a backup of your config folder? Please go to the backup and post the file called passwd from inside the config folder. It won't give any passwords away!
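
For anyone who wants to sanity-check a passwd file on a PC before posting it, here is a minimal Python sketch. It only assumes the standard passwd layout of seven colon-separated fields per line. The function name `check_passwd_text` and the sample lines are hypothetical and are not taken from the flash drive in this thread.

```python
def check_passwd_text(text):
    """Return a list of (line_number, problem) tuples for suspicious lines."""
    problems = []
    seen_root = False
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored
        fields = line.split(":")
        if len(fields) != 7:
            problems.append((number, f"expected 7 fields, found {len(fields)}"))
            continue
        name, _password, uid, gid, _comment, _home, _shell = fields
        if not uid.isdigit() or not gid.isdigit():
            problems.append((number, "UID or GID is not numeric"))
        if name == "root":
            seen_root = True
    if not seen_root:
        problems.append((0, "no well-formed root entry found"))
    return problems


# Hypothetical sample: the second line is truncated, as corruption might leave it.
sample = (
    "root:x:0:0:Console and webGui login account:/root:/bin/bash\n"
    "nobody:x:99:100\n"
)

sample_problems = check_passwd_text(sample)
for number, problem in sample_problems:
    print(f"line {number}: {problem}")
```

To check a real file, read it with `open(path).read()` and pass the text in. A clean result does not prove the file is fine, it only rules out obvious structural damage.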

 

30 minutes ago, Pyro said:

Then I started seeing smart errors. They were few and far between, so I kinda let it go.

Some SMART errors you can live with for a while, such as those caused by bad cables. Others compromise your ability to rebuild disks from parity. Either way they are there on the Dashboard for you to see and check up on.

Link to comment
12 minutes ago, John_M said:

You say that you re-created your boot flash from scratch but used a backup of your config folder?

Sorry, I've just re-read your OP and you don't explicitly mention a backup of your config. So get the passwd file directly from the config folder on your flash. You'll have to move it to a PC to read it.

Link to comment
51 minutes ago, Pyro said:

My plan was to:

1. Recover the data from my failed drive into the remaining questionable-but-still-living drives

2. Replace the parity with the 10tb

3. Rebuild the parity

4. Install the 8tb

5. Transfer everything to that drive

6. Remove all of my junk drives, leaving nothing but the 10tb parity, 8tb data, and cache SSD.

That sounds like a good plan. Another possibility would be to create a new array with the new parity and only the new data disk, and then try to transfer the data from the old junk disks using the Unassigned Devices plugin. Possibly some other scenarios. Once we get to a place where we can get diagnostics we will have a better idea.

 

Do you have backups of anything important and irreplaceable?

Link to comment
19 minutes ago, John_M said:

Sorry, I've just re-read your OP and you don't explicitly mention a backup of your config.

I have it backed up. I'll upload the password file as soon as I get home.

 

14 minutes ago, trurl said:

Do you have backups of anything important and irreplaceable?

Only a few files are irreplaceable, but it would save me weeks of work if I can recover everything.

Edited by Pyro
Link to comment

Have you tried a fresh install? Just backup your config folder somewhere, format flash and put a new clean install on it, don't restore any of your config except your .key file, and boot up. Then we can see what from your config we can reuse. All of your settings are in config, and they are mostly just text files (like most configuration files in Linux) so if you can get that far maybe we can just work through your setup again. And with diagnostics we can decide what to do about your disks.

Link to comment

Here is the diagnostic file. I put my Plus.key but not my Trial.key file back onto the flash drive. I'm showing unregistered, so I'm assuming I messed that up. Hopefully an easy fix.

tower-diagnostics-20181128-0705.zip

 

It's janky, but I have a computer running teamviewer on the network so I can do anything remotely that I need to while I'm at work. This computer has my config backup. I can also come back home at lunch to do something physically, if need be.

Edited by Pyro
Link to comment
25 minutes ago, Pyro said:

I put my Plus.key but not my Trial.key file back onto the flash drive. I'm showing unregistered, so I'm assuming I messed that up.

Not sure but your syslog suggests it isn't finding a .key file. The .key file must be put in the config folder of flash.

 

Also, the .key file can only be used on the exact USB flash drive it was registered to. Did you use that same USB flash drive for the new install?

 

If you have all that correct, then perhaps the .key is corrupt like possibly other things were on your flash. Do you have another copy of the .key, perhaps in an email?
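
A quick way to confirm where the .key files actually sit is to scan the flash drive from a PC. The Python sketch below assumes only what is stated above, that the key belongs directly inside the config folder. The function name `find_key_files` is hypothetical, and the demo uses a throwaway directory rather than a real flash drive.

```python
import tempfile
from pathlib import Path


def find_key_files(flash_root):
    """Split the .key files found on the flash into (in config, elsewhere)."""
    root = Path(flash_root)
    config_dir = root / "config"
    placed, misplaced = [], []
    for key in sorted(root.rglob("*.key")):
        if key.parent == config_dir:
            placed.append(key)
        else:
            misplaced.append(key)
    return placed, misplaced


# Demo: a key left outside config/. The top level of the flash is an
# assumption here; the thread does not say where the key was actually put.
with tempfile.TemporaryDirectory() as tmp:
    flash = Path(tmp)
    (flash / "config").mkdir()
    (flash / "Plus.key").write_text("placeholder")
    placed, misplaced = find_key_files(flash)
    placed_names = [p.name for p in placed]
    misplaced_names = [p.name for p in misplaced]

print("in config:", placed_names)
print("elsewhere:", misplaced_names)
```

On a real system, pass the drive letter or mount point of the flash instead of the temporary directory.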

Link to comment

Looks like Squid has identified your .key problem. It is in the wrong place.

 

Since none of your disks are assigned, I can't associate their SMART with the Dashboard screenshot you gave so I will just identify them by their serial number.

 

Serial Number:    WD-WCAVY5373109
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   199   051    -    99
  5 Reallocated_Sector_Ct   PO--CK   199   199   140    -    8
196 Reallocated_Event_Count -O--CK   192   192   000    -    8
197 Current_Pending_Sector  -O--CK   200   200   000    -    96
200 Multi_Zone_Error_Rate   ---R--   200   167   000    -    2

That disk needs to be replaced

____________________________

Serial Number:    WD-WCAVY6439141
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   199   198   051    -    57451
  5 Reallocated_Sector_Ct   PO--CK   187   187   140    -    97
196 Reallocated_Event_Count -O--CK   171   171   000    -    29
197 Current_Pending_Sector  -O--CK   200   200   000    -    6
198 Offline_Uncorrectable   ----CK   200   200   000    -    4
200 Multi_Zone_Error_Rate   ---R--   199   199   000    -    313

And that disk needs to be replaced

_______________________________
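
The reasoning behind those two verdicts can be sketched in Python: pick out the attributes that count bad or suspect sectors and flag any non-zero raw value. The watch list (IDs 5, 197 and 198) and the function name `flag_attributes` are assumptions made for illustration, not an official rule, and the input is the two attribute tables pasted above.

```python
# Attributes whose raw value is normally zero on a healthy disk.
# Assumption: this watch list is illustrative only.
WATCHED_IDS = {5, 197, 198}

DISK_3109 = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   199   051    -    99
  5 Reallocated_Sector_Ct   PO--CK   199   199   140    -    8
196 Reallocated_Event_Count -O--CK   192   192   000    -    8
197 Current_Pending_Sector  -O--CK   200   200   000    -    96
200 Multi_Zone_Error_Rate   ---R--   200   167   000    -    2
"""

DISK_9141 = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   199   198   051    -    57451
  5 Reallocated_Sector_Ct   PO--CK   187   187   140    -    97
196 Reallocated_Event_Count -O--CK   171   171   000    -    29
197 Current_Pending_Sector  -O--CK   200   200   000    -    6
198 Offline_Uncorrectable   ----CK   200   200   000    -    4
200 Multi_Zone_Error_Rate   ---R--   199   199   000    -    313
"""


def flag_attributes(report):
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    flagged = {}
    for line in report.splitlines():
        parts = line.split()
        # Data rows have 8 columns and start with the numeric attribute ID.
        if len(parts) < 8 or not parts[0].isdigit():
            continue
        raw = parts[-1]  # RAW_VALUE is the last column
        if int(parts[0]) in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            flagged[parts[1]] = int(raw)
    return flagged


flags_3109 = flag_attributes(DISK_3109)
flags_9141 = flag_attributes(DISK_9141)
print("WD-WCAVY5373109:", flags_3109)
print("WD-WCAVY6439141:", flags_9141)
```

Both disks show reallocated and pending sectors, which is what makes them poor candidates for a rebuild that has to read every sector of every disk.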

 

There is also a disk ST31500341AS_9VS05DG9 that isn't reporting SMART. Perhaps a bad connection, perhaps something worse. You might check its connections and see if you can get us a SMART for it by clicking on the disk and going to its Attributes.

 

Probably a rebuild from so many disks with so many problems will not be good, so maybe the alternative plan I suggested is the way forward.

21 hours ago, trurl said:

create a new array with the new parity and only the new data disk, and then try to transfer the data from the old junk disks using the Unassigned Devices plugin.

 

Link to comment

Oh, oops. I guess I was in too much of a hurry. I'll fix the .key at lunch.

 

I moved the machine, so it's possible that a SATA cable came loose. I'll also check those. If my assumption is correct, those 3 drives that you listed are the same 3 with yellow warnings on my original screenshot, and the emulated drive isn't showing up at all. There should be three WD 2tb drives and a single 1.5tb Seagate.

 

With that said, since the 8tb drive is currently my parity, wouldn't mounting it as the array drive make the data on the (currently) emulated drive completely inaccessible?

 

I'm obviously not super good at this stuff, but if I can get my array back to the broken-but-still-usable state it was in, can I go back to my original plan?

Edited by Pyro
added info, typo
Link to comment
1 hour ago, Pyro said:

With that said, since the 8tb drive is currently my parity, wouldn't mounting it as the array drive make the data on the (currently) emulated drive completely inaccessible?

If you have anything that is important and irreplaceable on any of your disks including the emulated one then you should try to copy those files to ANOTHER system before doing anything else. Don't try to write anything else to the array.

 

I don't think the other disks are good for trying to rebuild the emulated disk, or for building a new parity with. So after you have copied the important stuff then I am inclined to forget about the current array and start over. With the 10 TB parity disk and the 8TB data disk mentioned in your original plan, create a new array then mount each of the junk disks with Unassigned Devices and copy whatever you can from them to the new array.
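
When copying from disks in this condition, it helps if one unreadable file does not abort the whole copy. The Python sketch below shows that idea: copy file by file, keep going on errors, and list what failed. The function name `salvage_copy` is hypothetical, and the demo runs on temporary folders. On a real server the source would be wherever Unassigned Devices mounted the old disk and the destination a share on the new array, and those paths are assumptions you would need to fill in yourself.

```python
import shutil
import tempfile
from pathlib import Path


def salvage_copy(source, destination):
    """Copy every file under source into destination, continuing past errors.

    Returns (copied, failed), where failed holds (path, error message) pairs.
    """
    source, destination = Path(source), Path(destination)
    copied, failed = [], []
    for item in sorted(source.rglob("*")):
        if not item.is_file():
            continue  # directories are created on demand below
        target = destination / item.relative_to(source)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, target)  # copy2 also keeps timestamps
            copied.append(item)
        except OSError as error:
            failed.append((item, str(error)))  # note it and move on
    return copied, failed


# Demo on throwaway folders standing in for the old disk and the new array.
with tempfile.TemporaryDirectory() as tmp:
    old_disk = Path(tmp) / "old_disk"
    new_array = Path(tmp) / "new_array"
    (old_disk / "Movies").mkdir(parents=True)
    (old_disk / "Movies" / "example.txt").write_text("data")
    copied, failed = salvage_copy(old_disk, new_array)
    recovered = (new_array / "Movies" / "example.txt").read_text()

print("copied:", len(copied), "failed:", len(failed))
```

The list of failures tells you which files to retry later or restore from another source. Tools such as rsync do the same job with more safeguards, so this is only meant to show the approach.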

 

 

Link to comment
