Drive Replacement Gone Bad....

Greetings,

 

I was attempting to swap out a 250Gb drive for a shiny new 750Gb drive (drive 9).  Followed the instructions and it all was going well - until it got stuck at 20.6% complete.

 

See attached file for screenshot.  

 

So it looks like Drive 6 has died during the data-rebuild process.  Drive 9 is the new drive.  

 

Just wanted to confirm my options with the folks here at the forum.  It would appear that the data on Drive 6 will be lost.  I have the original drive 9 and can reinstall to the Unraid but given that the data-rebuild process has begun, it will not be helpful to restore the contents of drive 6 - right??

 

Assuming that I cannot recover the data on drive 6, is the best course of action to remove the faulty drive and initialize Unraid as new to regenerate parity?  I figured that I could mount the legacy drive 9 on my PC and just copy the contents to the Unraid once the initialization is complete.

 

Thanks for any help....  Bummer but at least the content of the lost drive can be recreated as it was primarily video/music.

There is a way to recover... as long as you have the original drive9 to re-install...

 

The basic game plan will be this:

 

1. Before you reboot, or do anything else, log in via telnet, grab a copy of the syslog so we have a better idea of exactly what is happening. (attach it to your next post) Instructions are in the wiki on how to do this here.

2. Grab/Print a screen-shot of the "Devices" page. (For your own records)

3. Log in via telnet and make a copy of the entire "config" folder by using the following command (it will copy the entire folder to one named for today):

cp -r /boot/config /boot/config20090430

4. Next... if you can, if the management console is still functional, stop the array.

5. Then, power down

6. Take the shiny new drive out of disk9 and put back in the old 250Gig drive.  This drive has not been written to and should still have your data.

7. Take a new drive somewhere between 250Gig and 750Gig (it can even be the one you were going to use to upgrade disk9) and use it to replace the disk in slot6.

8. Power up.  Odds are the server will notice the change in drive and not start.  If it does start, stop it immediately.

Go to the devices page, assign the old (original) disk9 to disk9, assign the new replacement drive to disk6. (Now the new drive is going to be used to replace a failed disk6 instead of a working disk9)
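
Steps 1 and 3 above can be scripted so that the file names carry today's date automatically. This is only a minimal sketch, not the procedure from the wiki: the function names save_syslog and backup_config are hypothetical, and the syslog location /var/log/syslog shown in the usage comment is an assumption about this server.

```shell
# Hypothetical helpers for steps 1 and 3; names and paths are illustrative.

# Compress a log file into a destination directory, stamped with today's date.
# Prints the path of the compressed copy.
save_syslog() {
    log=$1
    outdir=$2
    out="$outdir/syslog-$(date +%Y%m%d).gz"
    gzip -c "$log" > "$out" && echo "$out"
}

# Copy a config folder to a sibling folder named for today,
# refusing to overwrite a backup that already exists.
backup_config() {
    src=$1
    dest="${src}$(date +%Y%m%d)"
    if [ -e "$dest" ]; then
        echo "backup already exists: $dest" >&2
        return 1
    fi
    cp -r "$src" "$dest" && echo "$dest"
}

# On the server this might be run as (paths are assumptions):
#   save_syslog /var/log/syslog /boot
#   backup_config /boot/config
```

Writing both copies to the flash drive means they should survive the power-down in step 5, and the compressed syslog is small enough to attach to a post.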

 

Now, we need to get the server to come back on-line with it thinking drive6 has failed, and that drive9 is valid.  That way, it will use parity and all the other data drives to rebuild disk6 (The drive that has actually failed) onto the replacement drive.  To do that requires a very special set of steps.

 

A.  Press the "Restore" button, but DO NOT "Start" the array just yet.   All the drive indicators should turn blue.  The array status will be "Stopped - Initial Configuration"

 

B. Now, log in via telnet, or on the system console, and type two commands

cd

mdcmd set invalidslot 6

 

It should respond with:

cmdOper=set
cmdResult=ok

 

The prior command at the telnet/console prompt will tell the server that it is disk6 that needs reconstruction, and that parity should be trusted.   (Without you telling the server that disk6 is the bad drive, the array would throw away your parity and start re-computing it based on the new config.  This would cause the loss of disk6's data)

 

C. Now, once you have typed the mdcmd set invalidslot 6, and have seen its response, you can press the "Start" button.  You should then see disk6 being written to, and all the other drives being read.
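
Since pressing "Start" without a good reply in step B risks losing disk6's data, the reply can be checked mechanically before going on to step C. A minimal sketch, assuming mdcmd prints its reply on standard output in the form shown above; the helper name result_ok is hypothetical.

```shell
# Hypothetical helper: succeed only if the text on stdin contains a line
# that reads exactly "cmdResult=ok" (trailing whitespace tolerated).
result_ok() {
    grep -q '^cmdResult=ok[[:space:]]*$'
}

# On the server this might be used as (not run here):
#   mdcmd set invalidslot 6 | result_ok && echo "reply ok, safe to press Start"
```

If the check fails, stop and ask before pressing anything in the management console.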

 

Once it is completely re-constructed, you should have everything back as you wanted, with all the data.  The re-construction will take a number of hours... much like a parity check.

 

If you have any questions about this procedure...  ask first... before you do anything that will invalidate your current parity.   If you press "Restore" and did not tell it disk6 was bad, odds are you will lose disk6's data.   Normally the "Restore" button only sets an initial configuration and tells the server that the parity drive needs to be rebuilt.  (it normally invalidates parity)

 

By doing the steps outlined above... you should be able to recover all of disk6.

 

Joe L.

 

 

BTW, this brings up something I have been meaning to mention....

 

When upgrading a drive, don't use the array until it's finished.  Consider killing Samba and NFS so that no one can map or write to the drive over your network.

 

If you don't change any data on any other drives in the interim, you can recover from a drive failure during the upgrade process. 

  • Author

I have attached the syslog (zipped as it was 12MB) and have a screenshot of the devices page. 

 

I am no longer getting a response from the server so I am not sure if it stopped gracefully or not.  The telnet session is still responsive - should I issue a shutdown command?

 

Appreciate the timely response to this post.

 

Thanks,

Kevin

Yes, you can easily see the errors in the syslog... Tons of them when trying to read the failed disk.

 

See if the emhttp process is still running.  It might be, or it might have been killed off as the syslog used up all the available space in memory.

 

Now that you captured the syslog, we can power down as I said.  Make the copy of the "config" folder now if you have not yet done so before you power down.

 

You can "try" to take the array off-line cleanly by typing the following series of commands:

cd
killall smbd nmbd
sync
for disk in /mnt/disk*
do
  umount "$disk"
done
mdcmd stop

 

Then you can power down by typing:

 

poweroff

 

Of course, if you have the "powerdown" add-on package installed, you can just type:

powerdown

as it does all the above individual commands for you.

 

If a drive is unable to be un-mounted, it is probably "busy" (has an open file, or is the current directory for some process) and you will not be able to cleanly stop the array.  I would not let that worry you too much, since you will be forcing the array to think parity is good anyway later when you tell it disk6 is the one that is invalid.
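
To see which drives refused to un-mount, the loop can report failures instead of passing over them silently. A minimal sketch, assuming umount returns a non-zero status for a busy drive; the helper name unmount_all is hypothetical, and the un-mount command is passed in as the first argument so the function can be tried with a stand-in before using it on the server.

```shell
# Hypothetical helper: try to unmount each given mount point and print the
# ones that fail (for example because they are busy).
# $1 is the unmount command (normally "umount"); the rest are mount points.
# Returns 0 only if every mount point was unmounted.
unmount_all() {
    cmd=$1
    shift
    failed=0
    for disk in "$@"; do
        if ! "$cmd" "$disk" 2>/dev/null; then
            echo "still mounted: $disk"
            failed=1
        fi
    done
    return "$failed"
}

# On the server this might be called as (not run here):
#   unmount_all umount /mnt/disk*
```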

 

Joe L.

  • Author

Performed the steps as directed and it appears to be rebuilding drive 6.    As you said, it may take a while to reconstruct (~900 minutes).

 

Thanks again for the accurate instructions and rapid response.

 

Kevin

 

 

Let us know how it works out...  Glad it is able to save you some effort. (Re-ripping 250Gig of media is not fun)

 

Might I suggest several things once you get back stable.

 

1. Monthly full parity checks... Not to check parity, but to detect bad drives and let the SMART firmware on the drives work at fixing things they can fix.

2. Upgrade to 4.5-beta4  (it is very stable, and provides a fair amount of fixes since your version)  4.5-beta5 is due out shortly (I'll bet Tom is waiting till May 1st so he does not blow out his monthly bandwidth allotment, but that is tomorrow... so hopefully it will be out soon.)  The upgrade only involves replacing two files on your flash drive.  You can even re-name the existing ones in case the new version has any issues.  You don't need to re-configure, or to re-format, just download the new release, unzip, copy two files to the flash drive, and reboot.
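
The rename-then-copy upgrade described in item 2 could be sketched like this. It is only an illustration: the helper name replace_with_backup is hypothetical, and the file names and paths in the usage comment are assumptions, since the thread does not name the two files.

```shell
# Hypothetical helper: for each named file, rename the copy on the flash drive
# to <name>.old, then copy the new version into place.
# $1 = flash directory, $2 = directory holding the unzipped release,
# remaining arguments = file names to replace.
replace_with_backup() {
    flash=$1
    release=$2
    shift 2
    # Check that every new file is present before touching the flash drive.
    for name in "$@"; do
        if [ ! -f "$release/$name" ]; then
            echo "missing: $release/$name" >&2
            return 1
        fi
    done
    for name in "$@"; do
        if [ -f "$flash/$name" ]; then
            mv "$flash/$name" "$flash/$name.old" || return 1
        fi
        cp "$release/$name" "$flash/$name" || return 1
    done
}

# Example call (file names and paths are assumptions, not from the thread):
#   replace_with_backup /boot /tmp/new_release bzimage bzroot
```

If the new version has any issues, renaming the .old files back restores the previous release.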

 

Joe L.

Just had to say, Excellent work, Joe!

  • Author

Rebuild just completed successfully.  Thanks again for the help!!!

 

I will follow up on the remaining recommendations after a day or two of trouble free operation.

 

Kevin
