
[SOLVED] Bad day. I think I'm going to throw up!!



I had a very bad day yesterday. I thought I had lost everything in my 29tb array. I really thought I was going to be physically sick. I couldn’t get to sleep last night for hours.

 

This morning I’m beginning to think it’s maybe not that bad.

 

Long story short, for the past week I’ve been setting up my new unRaid server. I migrated two other machines onto this new one (my previous unRaid machine and a Windows machine running Plex, the arrs, UniFi, etc.). Yesterday morning, after a fresh parity rebuild, I unassigned one drive from the array to use as an unassigned device, then did a new config to remove the drive from the array. I tinkered around for the rest of the day getting my dockers configured properly; I’ve got maybe 20-25 dockers. When I got home from work I did a clean shutdown and then replaced the power supply and all of the SATA cables. After a week’s worth of work I figured I was finally 100% done, ready to place it into my rack, power up, and run a new parity build. I buttoned everything back up and turned on the power.

 

On the first boot I had a series of beeps, but I’m running headless so I didn’t see anything from the BIOS. All subsequent boots have sounded normal. I plugged in a monitor, but the BIOS seems to report no errors at this point.

unRaid boots but reports drive 1 is disabled, unmountable, no file system.

All of my dockers are gone.

I did some digging around this morning and it seems drive 1’s superblock is corrupt, missing, or unreadable. I hope this means the drive is physically OK, with a chance to at least recover the data.

I replaced all of the sata cables with the previous ones just to eliminate that as a possible failure point.

I can still access my shares over the network, but I’m sure any files that were located on drive 1 are not shown.

Through unRaid, I can browse the contents of the other drives in the array. Thankfully my plex metadata is on the unassigned device so I can save that. I found my docker image so I think that is good. I found my appdata folder and it seems to be intact, so I think that is good.

 

My first instinct, before I go messing about trying any type of repairs, is to grab a 12tb external and pull as much data down as I can just to preserve things in case I mess something up further. I think my total used space was around 10-11tb so a 12tb should be big enough to hold any data left and I can shuck it and add to the array after this is all over.

 

So I’m thinking I have a few options at this point.

 

Somehow repair the superblock, reboot, unRaid sees the drive, rebuilds parity. Life is good.

Replace or reformat drive 1?? Add the unassigned device back to the array and use the parity to rebuild. Since the last completed parity was yesterday morning with a different config would this even work? I realize I would have to recreate everything I did during the day yesterday, but with the appdata folder on hand that shouldn’t be too hard.

Backup data to external drive, replace drive 1, wipe all other drives and start over from scratch? This would take me forever, but it’s doable.

 

Assuming I get things fixed and drive one back into service, is this drive safe to use? Even if I get it working again, should I throw it away just to be safe?

 

Do you think changing the PS and SATA cables had anything to do with this?

 

I’ve attached my diagnostics files. I’m hoping someone is able to offer some advice on where I should go from here. I will be truly grateful for any help offered. Thanks.

unraid-tower-diagnostics-20210211-0439.zip

Link to comment

Thank you Jorge.

 

When you say parity disk is invalid, does that just mean the parity is not current? The disk/file system is still good right?

 

When I ran my last parity sync my config was

 

Parity disk

Array disks 1-6

Cache disk

 

My current config is

 

Parity disk

Array disks 1-5

Cache disk

Unassigned device disk

 

So I need to do a new config to match what was in the array when the last sync occurred, correct? And I should back up the unassigned device before I place it back into the array, correct?

 

I've got an external drive being delivered today. I still think I will pull down everything I can before I start messing around.

Link to comment
10 minutes ago, chrishick said:

One more thing. If the file system is damaged, do I need to format first or will the parity re-sync take care of that?

Not sure I understand the question?   

 

A format is used to erase the current contents of the drive.  A parity resync just gets parity back into sync with the current status of the disks and has nothing to do with format. 

 

Neither approach will restore the disk1 contents.  If you are lucky a file system repair on the disk1 might be able to do something but that has nothing to do with either format or parity resync.
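
To make the distinction concrete, here is a toy sketch. It assumes single parity is a plain XOR across the data disks and models each "disk" as one byte; the function name `xor_all` and the byte values are made up for illustration, and this is not Unraid's actual code.

```shell
#!/bin/bash
# Toy model only: each "disk" is a single byte.
# Assumption: single parity is the XOR of all data disks.

# XOR together every value passed as an argument and print the result (decimal).
xor_all() {
    local acc=0 v
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# Hypothetical contents of three data disks.
d1=0xA5
d2=0x3C
d3=0x0F

# Parity sync: parity is recomputed from whatever is on the disks right now.
parity=$(xor_all "$d1" "$d2" "$d3")

# Rebuild: a missing disk is reconstructed from parity plus ALL other disks.
rebuilt_d1=$(xor_all "$parity" "$d2" "$d3")

printf 'parity=0x%02X rebuilt_d1=0x%02X\n' "$parity" "$rebuilt_d1"
```

In this model a sync only ever writes to parity, so it cannot bring back anything that is missing from a data disk. A rebuild only gives the right answer if parity and every other disk are valid, and it reproduces the disk bit for bit, file system included, in whatever state parity recorded it.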

Link to comment
40 minutes ago, JorgeB said:

There are two invalid disks (parity and disk1), so disk1 can't be correctly emulated, disk itself looks OK, so if you do a new config with all existing disks and re-sync parity all data should come up fine, if it doesn't please post new diags.

Only do the above with your current disk assignments and post new diagnostics after.

 

53 minutes ago, chrishick said:

I really thought I was going to be physically sick.

Do you not have backups? Parity is not a backup. You must always have another copy of everything important and irreplaceable on another system.

Link to comment
33 minutes ago, itimpi said:

Not sure I understand the question?   

 

A format is used to erase the current contents of the drive.  A parity resync just gets parity back into sync with the current status of the disks and has nothing to do with format. 

 

Neither approach will restore the disk1 contents.  If you are lucky a file system repair on the disk1 might be able to do something but that has nothing to do with either format or parity resync.

Maybe I'm not understanding how this works, but I thought I could re-build a failed drive from the parity? Is that called a re-sync, or is it called something else?

 

My question is, if I want to rebuild drive 1 from the parity, but drive 1 has a damaged file system, does the parity resync restore the file system in the process of rebuilding the disk?

 

Sorry, not trying to be difficult, I really thought I understood how unraid works, I've been using it for a few years, but I've never had a drive problem before. I do appreciate the help. 

Link to comment
34 minutes ago, trurl said:

Only do the above with your current disk assignments and post new diagnostics after.

 

Do you not have backups? Parity is not a backup. You must always have another copy of everything important and irreplaceable on another system.

So leave the current config as is (5 disks), resync parity, and this should theoretically bring drive 1 back online?

 

I have some backups, but not as much as I should have I'm realizing. It's still going to take me a lot of time to rebuild. None of this data is irreplaceable, but I'll be reevaluating my backup strategy once I'm back online.

Link to comment
5 minutes ago, JorgeB said:

You can, but your parity is invalid:

 


            [name] => parity
            [device] => sdb
            [id] => WDC_WD120EMAZ-11BLFA0_8CJX0N9F
            [size] => 11718885324
            [status] => DISK_INVALID

 

There should be a yellow icon next to it.

There is an orange icon that says "Parity is invalid". I thought that meant the parity info written to the drive was out of date and didn't reflect the current state of the array. I thought the parity drive itself was still operable and writable, and that you could still rebuild a drive from the stale parity; you would just lose any data written to the array since the last parity sync.

 

On the dashboard it says "Parity is degraded: 1 invalid device". I believed that 1 invalid device was referencing the drive that was removed from the array yesterday morning. I was thinking if I replaced that device into the array then I could rebuild.

 

This is so confusing, but I'm learning slowly (very).

Link to comment

I'm connected through VPN from work, just digging around to see what's what. All of my network shares appear intact. It looks like all of my media files are still there.

 

In unraid, the red X icon for disk 1 tells me "Device is disabled. Contents emulated". So even though I thought my parity was bad, maybe it's not?

 

Still not sure why all my dockers are gone though.

Link to comment
2 hours ago, JorgeB said:

You can see all the data except the data on disk1, as already mentioned disk1 can't be correctly emulated due to parity being invalid.

Yeah I think you are right, but I honestly can't think what is missing. I think my array was around 11tb total and it is now 7.87tb, so I'm missing about 3tb. I'm backing up my media right now and there are 472 movies. I think I had just over 500, so maybe I'm missing a few movies.

 

What I know that I do have.

appdata folder

Plex data folder

docker.img

Maybe 80-90% of my media

 

The only other thing I might need would be my docker templates which I'll try to find and backup today. I can't think of anything else that is important and wouldn't be easy to replace.

 

I'll be backing up data today and then I will start with repairs as recommended above. Fingers crossed.

Link to comment
8 minutes ago, chrishick said:

Is there any kind of allocation table or file that tells you file x is stored on drive 1 and file y is stored on drive 2?

 

I'm curious to see what files were actually stored on drive 1. It would help me to replace lost content.

 

Unfortunately not :(

 

You could use the User Scripts plugin to periodically run a command of the form

 

ls -R > outputfile

 

but that could be a lot of output on a large array with possibly millions of files
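
A slightly more targeted variant is sketched below. It writes one sorted listing per array disk, so that if a disk is lost you can open that disk's file and see what was on it. It assumes the array disks are mounted as `disk1`, `disk2`, ... under a common root (normally `/mnt` on Unraid); the function name `list_disks` and the output directory are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: write one file listing per array disk.
# Assumption: array disks are mounted as <mnt_root>/disk1, <mnt_root>/disk2, ...
list_disks() {
    local mnt_root="$1" out_dir="$2"
    local disk name

    mkdir -p "$out_dir" || return 1

    for disk in "$mnt_root"/disk[0-9]*; do
        # Skip the unexpanded pattern when nothing matches, and anything
        # that is not a directory.
        [ -d "$disk" ] || continue
        name=$(basename "$disk")
        # One sorted list of regular files per disk, e.g. disk1.txt.
        find "$disk" -type f | sort > "$out_dir/$name.txt"
    done
}
```

A scheduled script could then call something like `list_disks /mnt /mnt/user/backups/listings`, where the output path is only an example. Keep a copy of the listings somewhere off the array, otherwise they can disappear along with the disk they describe.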

Link to comment
8 hours ago, chrishick said:

What I know that I do have.

appdata folder

Plex data folder

docker.img

Those should all be on cache if you have things configured ideally.

 

8 hours ago, chrishick said:

my docker templates which I'll try to find and backup

Docker templates are on the boot flash. You should always keep a current backup of boot flash.
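
If you want something scriptable, a minimal sketch is below. It assumes the flash drive is mounted at `/boot` (the usual Unraid location); the function name `backup_dir`, the archive name, and the destination path are hypothetical, and the destination should be on another disk or machine rather than on the flash itself.

```shell
#!/bin/bash
# Hypothetical helper: archive a directory into a dated, compressed tarball.
# Assumption: the destination directory is NOT inside the source directory.
backup_dir() {
    local src="$1" dest="$2" stamp

    [ -d "$src" ] || return 1
    mkdir -p "$dest" || return 1

    stamp=$(date +%Y%m%d)
    # -C changes into the source first, so paths in the archive are relative.
    tar -czf "$dest/flash-backup-$stamp.tar.gz" -C "$src" .
}
```

For example, `backup_dir /boot /mnt/user/backups/flash` would drop a dated archive of the flash contents into a share, with the share name here being only an example.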

Link to comment

OK, that was too easy! Take array offline, create new config, restart array. Disk 1 is back online!! It took literally 60 seconds. Once the parity rebuilds I'll be 100% again.

 

I wasted 2 days copying everything off to an external, but since I'm not as savvy as most unraid users, I just wanted to be safe before I did anything else.

 

Thank you to everyone who replied!!

 

One more problem. All of my dockers are still gone for some reason. I could reinstall all of them but I'm wondering if there is an easier way to get them back.

Link to comment
  • JorgeB changed the title to [SOLVED] Bad day. I think I'm going to throw up!!
