[SOLVED ] - Help a newbie - 3 drives redballed


Deryg
Solved by trurl


Hi all,

 

I've been running Unraid forever but am still fairly green. Starting last week, I've had drives start redballing. I assumed bad drives and just replaced them and rebuilt them.

This morning I checked on my server, expecting all to be well, since everything was finally green and working last night and seemed to be humming along perfectly. Now, however, I have 3 drives redballed, which is beyond the rebuild capability of dual parity.

 

How can I proceed? Is there a way to force it to recognize these drives as good to minimize data loss? I am not sure how to proceed.
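The arithmetic behind "beyond rebuild capability of dual parity" can be sketched as follows. This is a minimal illustration, not anything Unraid exposes; the function name `disks_to_trust` is hypothetical. It only counts how many disabled disks would have to be accepted as good again before parity could cover the rest.

```python
def disks_to_trust(parity_disks: int, disabled_disks: int) -> int:
    """Number of disabled disks that must be treated as valid again
    before the remaining ones fit within what parity can rebuild.

    Hypothetical helper: parity can emulate at most one disabled
    data disk per parity disk.
    """
    return max(0, disabled_disks - parity_disks)


# Dual parity with three disabled disks: one disk must be trusted again,
# which leaves two disks that parity can still rebuild.
print(disks_to_trust(parity_disks=2, disabled_disks=3))
```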

 

Edited by Deryg
tagging solved
Link to comment

To make matters worse, this started as 2 redballs. I started rebuilding one drive when another dropped, so I have a drive which I know is scrapped (because I panicked and rebooted everything via shutdown). However, is there a way to get the other 2 drives recognized as good and rebuild that one?

 

Link to comment

Shut the server down.  Open up the case and make sure that all of the SATA data and power cables are firmly seated.  Then check them a second time, being as careful as possible not to touch any cable other than the one currently being checked.  (SATA cables are notorious for working loose at the slightest disturbance.)  Avoid tying the cables together to make them 'neat', as this can put a cable under tension, which can cause it to become unseated when the drive vibrates as the platters rotate.

 

Disk #5---  I would not want this drive within two miles of any server I was using.

 

Disk #6---  This one is marginal.  I would want to run it through 3 (or 4) preclear cycles and see what happens to those Current Pending Sectors and Offline Uncorrectable numbers.

 

Disk #7--- SMART attributes look good.  That is why I am suggesting checking the cables.
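The kind of triage applied to the three disks above can be sketched roughly as below. This is only an illustration under assumptions: the attribute IDs are the standard SMART ones (5 Reallocated Sector Count, 197 Current Pending Sector, 198 Offline Uncorrectable), but the `marginal_limit` cutoff and the sample raw values are made up for the example and are not taken from the diagnostics in this thread.

```python
from typing import Dict

# Standard SMART attribute IDs, as listed by `smartctl -A`.
REALLOCATED_SECTORS = 5
CURRENT_PENDING_SECTORS = 197
OFFLINE_UNCORRECTABLE = 198


def triage(raw: Dict[int, int], marginal_limit: int = 10) -> str:
    """Classify a disk from the raw values of three SMART attributes.

    `marginal_limit` is an arbitrary, hypothetical cutoff chosen for
    this sketch; it is not an Unraid or vendor threshold.
    """
    watched = (REALLOCATED_SECTORS, CURRENT_PENDING_SECTORS, OFFLINE_UNCORRECTABLE)
    worst = max(raw.get(attr_id, 0) for attr_id in watched)
    if worst == 0:
        return "attributes look good, suspect cabling"
    if worst <= marginal_limit:
        return "marginal, retest before trusting"
    return "replace"


# Made-up raw values, for illustration only.
print(triage({5: 0, 197: 0, 198: 0}))
print(triage({5: 0, 197: 8, 198: 8}))
print(triage({5: 1200, 197: 300, 198: 300}))
```

A disk that lands in the middle category is the sort that gets several preclear cycles to see whether the pending and uncorrectable counts go up, stay put, or clear.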

Edited by Frank1940
Link to comment

I have no issue replacing those 2 drives. I mean, I guess I have horrible luck (although we did have some wicked power outages and storms recently). This will make 5 drives replaced in a week.

 

I may as well grab drives and SATA cables and replace the lot. I detest the current SATA cables.

 

However, it leaves me with the question: is there a way to force disk 7 as good (I am almost positive it is) so that I can rebuild disks 5 and 6? Currently, with 3 drives dead, I can't even begin corrective measures because I have an invalid configuration (too many drives missing).

 

This is where I'm stuck. How does one proceed to fix an array with too many drives gone offline?

 

Thanks for the assistance

Link to comment

I will ping @JorgeB, as he has provided a lot of help in situations like this.   But listen carefully: once a disk has been disabled, parity (both parity disks in your case) will be updated whenever a write occurs to the array.  If a write to Disk #7 was in progress when it went offline, the disk will probably be corrupted because the rest of the write was never completed.  How destructive the corruption is will depend on what type of data was being written--- file_data or file_tables.
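Why a disabled disk drifts out of sync can be sketched with a toy model. The assumptions are stated loudly: this models a single XOR parity over one small integer "block" per disk, while the second parity disk uses a different calculation that is not modelled here, and the disk contents are invented values. The helper names `xor_parity` and `emulate` are hypothetical.

```python
from functools import reduce
from typing import Iterable


def xor_parity(blocks: Iterable[int]) -> int:
    """XOR all blocks together (toy single-parity calculation)."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


def emulate(parity: int, surviving_blocks: Iterable[int]) -> int:
    """Reconstruct the one missing block from parity and the surviving blocks."""
    return parity ^ xor_parity(surviving_blocks)


# Invented contents for three data disks.
disk5, disk6, disk7 = 0b1010, 0b0110, 0b1100
parity = xor_parity([disk5, disk6, disk7])

# disk7 gets disabled. Its platters keep the old block ...
stale_disk7 = disk7
# ... and at this point the emulated disk7 still matches it.
assert emulate(parity, [disk5, disk6]) == stale_disk7

# A write to the emulated disk7 only changes parity, not the physical disk.
new_block = 0b0011
parity = xor_parity([disk5, disk6, new_block])

print(bin(emulate(parity, [disk5, disk6])))  # contents the array now presents
print(bin(stale_disk7))                      # contents still on the physical disk
```

After the write, the emulated contents and the physical contents differ, which is why anything written while a disk was disabled exists only in parity.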

 

Be looking seriously at a UPS.  Every server should really have one...

Link to comment
6 hours ago, Deryg said:

assumed bad drives and just replaced them

Do you have those drives? Bad connections are more common than bad disks. Though in your case that might not be true since you apparently didn't know you had bad disks 

 

You must set up Notifications to alert you immediately by email or other agent as soon as a problem is detected. Don't let one unnoticed problem become multiple problems and data loss.

 

On mobile now, will check diagnostics soon.

Link to comment
20 minutes ago, trurl said:

Save me the trouble of examining all your other disks. Do any of your disks have SMART warnings ( 👎 ) on the Dashboard page?

I went ahead and examined SMART for all the other disks while waiting on replies.

 

You have several WD disks. You should go to the settings for each of those and add attributes 1 and 200 for monitoring. One of those disks has a small count on attribute 1, so you will get a warning for that one after you set that up. I think it is OK to just acknowledge that for now by clicking on the SMART warning (👎) on the Dashboard page; it will warn again if the value increases.
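The acknowledge-then-warn-on-increase behaviour described above can be sketched like this. It is an illustration of the idea only and makes no claim about how Unraid implements it; the function `new_warnings` and the sample values are hypothetical. The attribute names are the ones smartctl commonly reports for IDs 1 and 200 on WD disks.

```python
from typing import Dict, List

# The two extra attributes suggested for WD disks.
MONITORED = {1: "Raw_Read_Error_Rate", 200: "Multi_Zone_Error_Rate"}


def new_warnings(acknowledged: Dict[int, int], current: Dict[int, int]) -> List[str]:
    """Return a warning for each monitored attribute whose raw value
    has risen above the value that was last acknowledged."""
    warnings = []
    for attr_id, name in MONITORED.items():
        before = acknowledged.get(attr_id, 0)
        now = current.get(attr_id, 0)
        if now > before:
            warnings.append(f"{name} ({attr_id}) rose from {before} to {now}")
    return warnings


# Made-up values: a count of 3 on attribute 1 was acknowledged earlier.
acknowledged = {1: 3, 200: 0}
print(new_warnings(acknowledged, {1: 3, 200: 0}))  # unchanged, so no warning
print(new_warnings(acknowledged, {1: 5, 200: 0}))  # increased, so warn again
```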

 

33 minutes ago, trurl said:

Do you have spare disks to replace those bad ones? 

You have several unassigned disks. How are you using those?

 

56 minutes ago, Deryg said:

a way to force disk 7 as good

Yes. Will wait on your return to begin discussing that.

Link to comment

I will not argue with the One Wise Man here... I would expect that you will end up recovering the data from the 'failed' disks and rebuilding parity... 

 

From your initial post it seems that you have a faulty SATA connector or controller... 

 

If you are willing to rely on some of these 'failed' disks... I would expect that you could build a new array and copy the data from these disks that failed... 

Link to comment
1 minute ago, mathomas3 said:

copy the data from these disks that failed... 

Simpler to rebuild the failed disks. Just need to get all the disks back into the array, disable the bad disks, then work with the emulated disks to see where to go from there. We've done similar many times and it often works out well.

 

Waiting on OP to rejoin the thread.

Link to comment
13 hours ago, trurl said:

Do you have spare disks to replace those bad ones? 

Nope, going shopping today. My spare drives were used up replacing drives 7 and 8, so I will pick up 3 new drives as well as new SATA cables.

 

(apologies for tardy reply. Was offline dealing with kids who lost their plex :) )

Link to comment
12 hours ago, trurl said:

Your cache disk is reiserfs. You really should put an SSD there anyway so maybe we will do that over after getting your array stable again. Disable Docker in Settings and leave it disabled until further instructions.

 

10-4, Docker disabled. I saw the ReiserFS on the cache disk but figured I'd get to that after fixing things and getting stable. SSDs are cheap now, so no issue replacing it.

Link to comment
12 hours ago, trurl said:

You have several unassigned disks. How are you using those?

 

I'm not. They are old drives from various systems that I pulled data from. They're more like filler in my drive cages to balance airflow than anything else now. The plan is to use these as unassigned devices and periodically copy crucial files and docs to them, then pull them and ship them off-site to the in-laws, basically rotating a backup copy of family photos etc.

Link to comment
12 hours ago, trurl said:

Yes. Will wait on your return to begin discussing that.

I'm back online but will not have drives to begin the fix until after lunch as I need to buy some new ones. Have I mentioned how fantastic it is that you are taking the time to talk me through this and try to help? This is so appreciated, you have no idea.

Link to comment
12 hours ago, trurl said:

Simpler to rebuild the failed disks. Just need to get all the disks back into the array, disable the bad disks, then work with the emulated disks to see where to go from there. We've done similar many times and it often works out well.

 

Waiting on OP to rejoin the thread.

The problem is that I had started a rebuild on drive 6 when it all went to shit. As it was in the process of rebuilding, I assume that one is toast and unrecoverable?

 

Link to comment

Damn, I forgot today is a holiday, so no replacement drives can be bought. All the stores are closed. The array will remain offline for another day. Anything I can do in the meantime?

 

I'm still unclear as to how to proceed as well once I do secure some new drives and cables.

 

I did reseat my card (PERC H310), and reconnected all drive cables in case I had bad connections. 

 

Link to comment
51 minutes ago, Deryg said:

the problem is that I had started a rebuild on drive 6 when it all went to shit. as it was in the process of rebuilding, I assume that one is toast and unrecoverable?

Were you rebuilding onto the same disk? Even if so, might be recoverable.

 

You are in a situation similar to many we have helped users with, sometimes with no data loss, sometimes with some data loss, or at least with some data that needs to be examined and put back in the correct place.

 

18 minutes ago, Deryg said:

today is a holiday, so no replacement drives can be bought. all the stores are closed

Federal holiday, but many stores have Presidents' Day sales.

Link to comment
  • Deryg changed the title to [SOLVED ] - Help a newbie - 3 drives redballed
