i really messed up...


mattbr

So, dinked around on a running array, touched a SATA cable the wrong way, tried to follow the "For unRAID v6" section of the wiki, and things went wrong.

 

The single affected drive got formatted before being added back (as an empty drive), and when I restarted a parity check to try to rebuild, the only writes were to parity and to the affected drive (in about equal numbers). Parity had been good and checked before this: by luck, there was a cron'd parity check overnight.

 

I've stopped the array and booted up SystemRescueCd on another machine to see if there's anything salvageable on the drive itself. The data on the drive was, I think, pretty much all backed up offsite and not exactly essential anyway, but getting it back would be nice. Most importantly: how can I make sure I don't hose the other drives in the array when I start it back up?

Link to comment

You really should have asked before you did anything.

 

The wiki article you linked was not the procedure you should have been following. And apparently you did it wrong anyway since you say parity was written.

 

What you should have done is rebuild the data disk. The disk was disabled because a write to it failed. The failed write, however, was still used to update parity, so the data on the disk was invalid, but the valid data could be reconstructed from parity together with the other data disks and could have been used to rebuild the disk.

 

And possibly, the invalid disk data made the disk unmountable, so unRAID offered to format it for you. Maybe. It's not really clear what led to you formatting the disk, but it's too late now.

 

Format is never part of rebuilding a disk. Format means "write an empty filesystem to this disk". unRAID treats this write just like it does any other, by updating parity. So parity now agrees that the disk has an empty filesystem on it.
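
To make the point about format concrete, here is a toy single-parity model in Python. It assumes plain XOR parity across same-sized disks, which is how single parity is usually described; the 4-byte "disks" and the helper names `xor_blocks` and `rebuild` are made up for illustration and are not unRAID code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(parity, other_disks):
    """Reconstruct one missing disk from parity plus every other data disk."""
    return xor_blocks([parity, *other_disks])

# Hypothetical 4-byte "disks"; real disks are just much longer.
disk1 = b"DATA"
disk2 = b"MORE"
disk3 = b"FILE"
parity = xor_blocks([disk1, disk2, disk3])

# While parity is valid, a lost disk3 can be rebuilt exactly.
assert rebuild(parity, [disk1, disk2]) == b"FILE"

# A format is an ordinary write inside the array: the disk gets an empty
# filesystem (modelled here as zero bytes) and parity is updated to match.
disk3 = b"\x00\x00\x00\x00"
parity = xor_blocks([disk1, disk2, disk3])

# Rebuilding now faithfully reproduces the empty disk, not the old data.
rebuilt = rebuild(parity, [disk1, disk2])
print(rebuilt)  # b'\x00\x00\x00\x00'
```

In this model the old contents are gone from parity as soon as the format is written, which is why a rebuild after a format cannot bring the files back.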

 

It might be possible to recover some files from the disk, but it would be simpler to restore them from backup. Before doing anything with that disk on another system it would be better to work with it in unRAID to see what can be done. If another system writes anything to that disk you will have to rebuild parity.

 

What filesystem was the disk?

Link to comment

Hey, yeah, know I should've asked... but, well, live and learn...

 

Thing is, there was no option to rebuild the disk that I could find, and stopping / reassigning just led to it staying emulated.

 

It's formatted in XFS. The superblock is hosed, says xfs_repair, which is running.

Link to comment

Thought I'd done the stop - unassign - start unassigned - stop - reassign dance, though clearly messed up somewhere and just went "yay ! unassigned devices says it's mounting and ok <sigh of relief>!" or something stupid.

 

I'm doing the repair from the CLI on another machine - the server is shut down, I didn't want to risk hosing the other data drives as well.

Link to comment

By doing the repair on another machine, you have invalidated parity, since any changes the other system makes to the drive will not be in parity. You will have to rebuild parity (New Config, DON'T Trust Parity) when you put the drive back in your array. Please ask for help.
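
A toy XOR model shows why a write made outside the array matters for the other drives too. This is only a sketch under the assumption of simple XOR parity over equal-sized disks; the byte strings and the names `xor_blocks` and `parity_is_valid` are hypothetical, not anything unRAID exposes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def parity_is_valid(parity, disks):
    """Parity is valid when it equals the XOR of all data disks."""
    return xor_blocks(disks) == parity

# Hypothetical array: two healthy disks plus the freshly formatted one.
disk1, disk2, disk3 = b"AAAA", b"BBBB", b"\x00\x00\x00\x00"
parity = xor_blocks([disk1, disk2, disk3])
assert parity_is_valid(parity, [disk1, disk2, disk3])

# Another machine writes to disk3 (say, a repair tool rewriting metadata).
# Nothing updates parity, because the array never saw the write.
disk3 = b"\x00XFS"
stale = not parity_is_valid(parity, [disk1, disk2, disk3])

# If disk1 failed now, rebuilding it from the stale parity gives wrong bytes.
rebuilt_disk1 = xor_blocks([parity, disk2, disk3])
print(stale, rebuilt_disk1 == disk1)  # True False

# Recomputing parity from the disks as they are now makes it valid again.
parity = xor_blocks([disk1, disk2, disk3])
assert parity_is_valid(parity, [disk1, disk2, disk3])
```

The last step is the toy equivalent of rebuilding parity from the current disk contents rather than trusting the old parity.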
Link to comment

Definitely should've asked for help sooner...

 

I was mostly trying to prevent damage to the other drives in the array... hence the pulling of the "bad" drive and the shutdown of the array. Seeing the "bad" drive and parity being written freaked me out, since I figured it'd mean certainty of losing both - hence "ok, let's try to minimise risk to the rest of the data, see if anything at all can be salvaged, and take it from there" approach.

 

So, practically, it's "reconnect the drive, press the go button", the array will then boot stopped, New Config, don't trust parity, let it do its thing, and then start getting whatever was on that drive back on, right? No need to pre-clear it again?

Link to comment

At this point I would hesitate to say exactly how you should proceed since I'm not entirely clear what state your array is in at the moment. If for some reason unRAID thinks you are adding a new drive then starting the array will clear the drive. And if it thinks the drive is being replaced it will want to try to rebuild it from parity.

 

The best thing to do at this point is to go into Settings - Disk Settings before you put the drive back in and set Enable auto start: to No. Then it won't do anything until you tell it to and then you can put the drive back in and New Config.

Link to comment
  • 2 weeks later...

Ok, so, back at this (was travelling for a bit), and with a nice external drive full of mangled filenames - by the volume of data, there hasn't been much if any loss, and what I don't have in one of two other arrays, I should be able to, erm, rebuild anyway, so it isn't like I'm losing my life's work if I wipe the drive.

 

Just booted the array up, problematic drive not plugged in. It shows as missing, array stopped, everything looks to be assigned as it should (minus the problem drive obviously), disk prefs set to start with a stopped array.

 

What's the best course of action from here ? NewConfig, rebuild parity for the good drives and after that add the problematic one as an empty drive ?

 

(in terms of data rebuilding, the plan is to wait for the screwed drive to come back online, then go read the rsync docs to figure out how not to touch anything that's been added recently to make sure any files that might have been upgraded on the still-good part of the array don't get stepped back to a previous incarnation - I'd assume

 du -hs * | sort -h 

would still work to, in this case, give me the names of the empty folders, right ?)
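
On the two questions in that plan: `du -hs *` only reports sizes of the top-level entries, so empty folders show up as the smallest lines rather than being identified as empty, and on some filesystems an empty directory still reports a non-zero size. For the copy itself, rsync documents `--ignore-existing` (skip files that already exist on the receiver) and `--update` (skip files that are newer on the receiver). The Python sketch below only illustrates the same two ideas on a throwaway directory tree; the folder names, file names, and the helpers `empty_dirs` and `copy_missing` are all hypothetical.

```python
import os
import shutil
import tempfile

def empty_dirs(root):
    """Return directories under root that contain no files or subdirectories."""
    return sorted(p for p, dirs, files in os.walk(root) if not dirs and not files)

def copy_missing(src, dst):
    """Copy files from src into dst, skipping any file that already exists in dst."""
    copied = []
    for path, _dirs, files in os.walk(src):
        rel = os.path.relpath(path, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            target = os.path.join(target_dir, name)
            if not os.path.exists(target):  # never overwrite what is already there
                shutil.copy2(os.path.join(path, name), target)
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(copied)

def write(path, text):
    """Create parent folders as needed and write a small text file."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)

with tempfile.TemporaryDirectory() as tmp:
    backup = os.path.join(tmp, "backup")
    array = os.path.join(tmp, "array")
    write(os.path.join(backup, "photos", "a.jpg"), "old copy")
    write(os.path.join(backup, "docs", "notes.txt"), "old")
    os.makedirs(os.path.join(array, "photos"))  # folder survived, contents did not
    write(os.path.join(array, "docs", "notes.txt"), "newer")

    empties = [os.path.relpath(p, array) for p in empty_dirs(array)]
    copied = copy_missing(backup, array)
    with open(os.path.join(array, "docs", "notes.txt")) as f:
        kept = f.read()

print(empties, copied, kept)
```

Here the lost photo is copied back while the newer `notes.txt` on the array is left alone, which is the behaviour to look for when choosing the rsync options.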

Link to comment

If you are OK with having that drive empty, it would be quicker to include it in the parity rebuild. If you add it after the parity rebuild, unRAID will have to clear it.
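
The reason a disk added later has to be cleared can be seen in a toy XOR model: an all-zero disk leaves existing parity unchanged, while a disk that still holds any bytes does not. This is a sketch assuming simple XOR parity; the byte strings and the helper `xor_blocks` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Hypothetical array whose parity has already been rebuilt for two disks.
disk1, disk2 = b"AAAA", b"BBBB"
parity = xor_blocks([disk1, disk2])

# Adding a cleared (all-zero) disk: XOR with zero changes nothing,
# so the existing parity stays valid.
cleared = bytes(4)
parity_unchanged = xor_blocks([disk1, disk2, cleared]) == parity

# Adding a disk that still holds any data: the existing parity no longer
# matches, so it would have to be recomputed instead.
used = b"\x00XFS"
parity_still_valid = xor_blocks([disk1, disk2, used]) == parity

print(parity_unchanged, parity_still_valid)  # True False
```

Including the drive in the parity rebuild sidesteps the clearing pass because parity is computed from whatever is on the disk at that point.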
Link to comment
