[SOLVED] My first Failed Drive experience


Over this long weekend I thought I'd have some fun and update my unRAID system from 5.0-rc8a to 5.0-rc10. But, before I could even get started I found a red ball next to one of my data disks (Drive 1). That got me started reading the forum more than I usually do!

 

This was my first experience with a failed drive, and I soon realized the advantage of having a PRE-CLEARED spare disk on hand... oh, that should read: my bad for NOT having a pre-cleared disk on hand! It was time to go visit my nearest Best Buy. At this point I thought I should be on the safe side, so I took the array off-line and turned the server off for a while. Family circumstances were favorable to this option.

 

I picked up a 2TB Seagate disk, brought the server back up, array off-line, and started a pre-clear on it. About 5 hours into it I got a message the procedure failed because the system was not able to write to the MBR (or words to that effect). Bummer! OK, let's read the forum some more and sleep on it. I decided I may have had one of those connector type problems with the first pre-clear so I should make some changes to the routine before trying it again.

 

The next day, after a second trip to BB, I had a 2TB WD Green drive. I used a different SATA port, data cable and power adapter, and after a mere 26.75 hours I had an honest-to-goodness, ready-to-go SPARE drive.

 

OK, so the procedure is.....1. array off line....check! 2. power down system and replace failed drive...check! 3. Bring system up and rebuild should start....Say What?

 

I was somewhat surprised to see the array had started, with a red ball next to Drive 1 and something like "Drive Missing" next to it. At this point I felt like I was on my own and might have to improvise a little. I took the array off-line, assigned the WD spare I had installed to the Drive 1 slot, and now I got a blue ball next to Drive 1.

 

(My memory is a little fuzzy on these next few points, sorry I didn't document it better)

 

(I think) I brought the array up and the blue ball went to orange indicating data inconsistency. What to do?

 

I might have taken the array off-line and the orange ball remained, but I felt it was time to tick the "I'm sure" checkbox and press Start, next to the note "Start will bring the array on-line, start Data-Rebuild, and then expand the file system."

 

YES! Things started looking good: I was seeing the words "Data-Rebuild in progress." It was going to take 4-5 hours to rebuild. I also noticed SABnzbd and Sick Beard were running, and I thought it was a good idea to shut them down and leave them off during the rebuild. I did that in SimpleFeatures' GUI.

 

The array is back to healthy with all green balls. I'm taking an extra step to do a "read-only" parity check (I thought I read that somewhere here) and see where we go from there. I think if that passes tomorrow night I'll do another parity check and take the option to write changes to the parity drive because it had been close to 30 days since my last check.

 

I'm not quite out of the woods yet. I hope this all finishes well. I want to upgrade the parity drive to something larger and oh yes, get 5.0-rc10 installed!

 

So, it seems like the version of unRAID I've been running doesn't quite follow the way the FAQs are written. Did I do the right thing?

 

Cheers!

As long as the array has a parity disk, it can still be used even though a disk has failed, because the parity allows the failed disk to be 'emulated'.  However, another disk failing would lead to lost data, so you want to fix the failed disk as soon as possible.
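To illustrate what 'emulated' means here, a minimal sketch, assuming single parity computed as the byte-wise XOR of all data disks. The disk contents and the helper name `xor_blocks` are made up for the example and are not unRAID code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three toy "data disks", each holding one 4-byte block (illustrative values).
disk1 = bytes([0x12, 0x34, 0x56, 0x78])
disk2 = bytes([0xAB, 0xCD, 0xEF, 0x01])
disk3 = bytes([0x00, 0xFF, 0x10, 0x20])

# The parity disk holds the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# Disk 1 red-balls: its contents can be recomputed from parity plus the survivors.
emulated_disk1 = xor_blocks([parity, disk2, disk3])
assert emulated_disk1 == disk1
```

The same calculation is what a rebuild writes onto the replacement disk. With two disks missing, a single XOR parity no longer has enough information, which is why a second failure means lost data.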

 

You are correct in that the next step is to pre-clear a disk ready to replace the failed disk.  Whether you have the array stopped during the pre-clear process is up to you.  It is also a good idea to keep the disk that was red-balled in its current state as in an emergency you should be able to get all (or almost all) the data off it in most cases.

 

When you first assign the pre-cleared disk to the array it will have the blue icon, so that is expected behaviour.  When you then start the array it should go to the amber icon, indicating data inconsistency.  At this point the rebuild process is activated to get the disk back into normal operation, and when that completes successfully the icon will return to the green ball.
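The icon sequence described above can be written down as a small lookup table. This is only a sketch of the sequence as described in this thread; the event names are invented for illustration and are not unRAID terms.

```python
# (current icon, event) -> next icon, following the description above.
TRANSITIONS = {
    ("red", "assign_replacement"): "blue",    # new disk assigned to the failed slot
    ("blue", "start_array"): "orange",        # array started, rebuild begins
    ("orange", "rebuild_complete"): "green",  # rebuild finished successfully
}

def next_icon(icon, event):
    """Return the icon after an event; unknown combinations leave it unchanged."""
    return TRANSITIONS.get((icon, event), icon)

icon = "red"
for event in ["assign_replacement", "start_array", "rebuild_complete"]:
    icon = next_icon(icon, event)
assert icon == "green"
```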

 

I do not think there is any significant discrepancy between the steps the documentation describes and what you experienced, but if you think there is, it is worth pointing out so the documentation can be improved.

 

When the array is back in normal operation you may want to try and pre-clear the disk that had been indicated as failed. It is possible that the disk is actually OK, but just that a write failure (for whatever reason) happened and this is enough for unRAID to red-ball the disk.  However check the results of the pre-clear carefully to see if it indicates the disk is going bad, or if it is going to be safe to re-use it if needed.

  • Author

Thanks itimpi for confirming the findings. I think my expectations were more along the lines of the rebuild starting automatically. I have other experience with Dell PERC controllers and RAID5 arrays where the drives can be hot-swapped and the controller will detect the new drive and start the rebuild.

 

However, I see the value in the way unRAID handled it. With my system so far, all I've had to do is parity checks and one 5.0-rc update. With this drive failure and my last "rc" update I got the impression if you took the array off line it (might) stay in the off-line state upon restart. But in both cases the array mounted at restart.

 

Thanks also for the suggestion on improving the documentation. The experience also brought home the advantage of having a dedicated pre-clear station, using a spare computer to run pre-clear away from the production unRAID server.

hi there, i don't think it's just you!  ironically i also had my first red ball this past weekend, and after replacing the drive, i didn't realize that you had to assign the new drive to the old slot...

 

i guess it's obvious in a way, but it wouldn't be bad for the wiki to mention this exact step!

 

Older versions of unRAID used the disk controller port ID to tie a slot in the array to a given disk.  If you used the same port, it figured it out and you did not have to assign it on your own.

 

Newer versions of unRAID use the disk model/serial number to tie a slot to a disk.  Because of this it has no way to know which disk is replacing another.  You must therefore select the disk yourself.  This change from older to newer disk association occurred in the mid 5.0-beta series.
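A rough sketch of the difference between the two schemes. The function names, slot names, port names and serial numbers are all hypothetical; this is not how unRAID stores its configuration, only an illustration of why the newer scheme needs a manual assignment.

```python
def resolve_by_port(slot_to_port, port_to_serial):
    """Older scheme: a slot follows whatever disk sits on its controller port."""
    return {slot: port_to_serial.get(port) for slot, port in slot_to_port.items()}

def resolve_by_serial(slot_to_serial, present_serials):
    """Newer scheme: a slot is filled only if the exact same disk is present."""
    return {
        slot: serial if serial in present_serials else None
        for slot, serial in slot_to_serial.items()
    }

# Saved configuration before the failure (hypothetical names and serials).
slot_to_port = {"disk1": "sata1", "disk2": "sata2"}
slot_to_serial = {"disk1": "WD-OLD111", "disk2": "ST-222"}

# After the swap: a new disk sits on the same port the failed disk used.
port_to_serial = {"sata1": "WD-NEW999", "sata2": "ST-222"}

by_port = resolve_by_port(slot_to_port, port_to_serial)
by_serial = resolve_by_serial(slot_to_serial, set(port_to_serial.values()))

assert by_port["disk1"] == "WD-NEW999"  # replacement picked up automatically
assert by_serial["disk1"] is None       # slot shows as missing until assigned by hand
```

In the serial-based case the slot stays empty because nothing ties the new serial number to the old slot, which matches the "Drive Missing" red ball described earlier in the thread.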

 

The wiki (especially the "official" version) is woefully out of date in many areas.  The un-official version is much closer, as it can be edited by any member of the forum.  If you find something in the unofficial version where you can improve it for the people who follow, edit away.  Unfortunately, only Tom @ lime-tech can change the "official" version.

 

Joe L.

 

 

  • Author

Joe L. thanks for (post)clearing :D that up! My system is back to running well. This forum is a great source of knowledge and experience!

 

The replacement Seagate drive I purchased, which failed its initial pre-clear run, passed its second run-through on my cobbled-together PreClear Workstation. That reinforces the connector-itis theory (most likely a bad power splitter). I also plan to put the original failed disk through a pre-clear after things stabilize a little more.

 

I'll look over the unOfficial Wiki and see if I can work some of this into the doc. Thanks again!!

Archived

This topic is now archived and is closed to further replies.
