BR0KK

My first drive failure with unraid


Hi there,

I'm new to Unraid.

 

Everything worked fine for a long time and I started to get comfy with my installation.

I wanted to do a backup of my server with Duplicati and got this running. It copied around 900 GB of 2.5 TB to my NAS very slowly ... but ok ... I can wait :/.

Yesterday I went to check on my server and it displayed an error message that one of my 8 TB drives has failed.

 

[Attached images: unraid.JPG, unraid2.JPG]

 

I'm not sure how to proceed now. From what I've read, Unraid is currently emulating the one 8 TB drive via the parity drive and my data is safe for now.

I need to get a spare drive today ....

I disabled all dockers besides Krusader and there are no VMs on that machine.

 

This drive is brand new and successfully completed the preclear..... Is there a chance the drive is fine and Unraid just had a "hiccup"?

 

I have a second 8 TB data drive which currently sits empty (Picture 1, last drive).

I know that I can install Krusader and copy over the files, but because I'm a noob I'm not sure how to do so safely without causing issues with the shares (Picture 2).

System, nextcloud and appdata are folders created by Unraid. If I go with Krusader, can I just copy them from disk 5 to disk 6?

 

 

Thank you :)

 

 

 

 

 


Looks more like a connection problem, but since disk5 dropped offline there is no SMART report. Power down, replace/swap cables on that disk and post new diags.


Yes, I'll try that as soon as I get home.

Drive 5 is connected via a backplane (the same backplane all the 8 TB drives use).

 

Is it safe to just reboot, or do I have to stop the array first (stop the array, set it to "do not start automatically" and then power cycle the unit)?

Can I just power down the unit and restart it?

What happens to the data that is not written to the disk yet?

Drive 5 is being emulated via the parity drive, right? What if data was written to the emulated drive no. 5 while in this state?

The data isn't that important but I'd like to know more about how this works (or not).

If the drive comes back online, I assume that Unraid will try to rebuild this drive from the parity drives (does that include the data that might have been written to the emulated disk?)

 

Thank you for the help so far :)

Edited by BR0KK

59 minutes ago, BR0KK said:

Drive 5 is being emulated via the parity drive, right? What if data was written to the emulated drive no. 5 while in this state?

Any data written while the drive is being emulated will be there.

 

1 hour ago, BR0KK said:

If the drive comes back online, I assume that Unraid will try to rebuild this drive from the parity drives (does that include the data that might have been written to the emulated disk?)

No.   Unraid will know that it disabled that drive so will not automatically try to rebuild onto it without user intervention.

 

Note that a failed drive is not "rebuilt from parity". It is rebuilt by reading all the 'good' drives in conjunction with the parity drive. This is why you never want untrustworthy drives in the array, as a rebuild requires that all 'good' drives are read without error. It is possible this is what you meant, but it may also mean that you do not understand how parity works.
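
To make the two answers above concrete, here is a minimal Python sketch of how an emulated disk can be read and written. It assumes plain single-parity XOR and models each disk as one 4-byte sector. The disk contents and the helper names (`xor_blocks`, `read_emulated`, `write_emulated`) are made up for illustration and are not Unraid code. As far as I know the second parity disk uses a different calculation, which is not shown here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical array: two healthy data disks plus disk 5, one sector each.
good_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"]
old_disk5 = b"\xaa\xbb\xcc\xdd"

# Parity was computed while disk 5 was still online.
parity = xor_blocks(good_disks + [old_disk5])

def read_emulated(good, parity_block):
    """Work out the missing disk's sector from the good disks and parity."""
    return xor_blocks(good + [parity_block])

def write_emulated(good, new_data):
    """The disk is offline, so only parity changes; return the new parity."""
    return xor_blocks(good + [new_data])

# Reading the emulated disk returns what was on it before it dropped.
assert read_emulated(good_disks, parity) == old_disk5

# A write while emulated lands in parity and can be read back afterwards.
new_data = b"\xde\xad\xbe\xef"
parity = write_emulated(good_disks, new_data)
assert read_emulated(good_disks, parity) == new_data
```

Because the new data lives in parity on disk rather than in memory, a later rebuild computed from the good disks and parity would include it. Note that every read of the emulated disk touches all the good disks, which is why a read error on any of them is a problem.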


Is there a way to inspect the drive before I put it back into Unraid? Externally, maybe via an HDD testing tool, to figure out if the drive has an issue?

 

12 minutes ago, itimpi said:

Any data written while the drive is being emulated will be there.

Does that include reboots or complete powerdowns?

 

3 minutes ago, BR0KK said:

Is there a way to inspect the drive before I put it back into Unraid? Externally, maybe via an HDD testing tool, to figure out if the drive has an issue?

SMART report will be a good indication, you can also run an extended SMART test.
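
For anyone who prefers the command line, a SMART report and an extended test can also be requested with `smartctl` from smartmontools. The Python sketch below only builds and prints the commands; the device path `/dev/sdX` is a placeholder and the helper names are hypothetical. `run_command` is included as an assumption of how you might call them, and it needs smartmontools installed and root access.

```python
import subprocess

def smartctl_commands(device):
    """Return the smartctl invocations for a report, an extended test and the test log."""
    return {
        "report": ["smartctl", "-a", device],                # full SMART report
        "extended_test": ["smartctl", "-t", "long", device],  # start extended self-test
        "test_log": ["smartctl", "-l", "selftest", device],   # show self-test results
    }

def run_command(command):
    """Run one command and return its text output (requires smartctl and root)."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout

# /dev/sdX is a placeholder; substitute the real device before running anything.
commands = smartctl_commands("/dev/sdX")
for name, command in commands.items():
    print(name + ":", " ".join(command))
```

The extended test runs inside the drive and can take many hours on an 8 TB disk, so the test log is usually checked later rather than right away.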

28 minutes ago, itimpi said:

Note that a failed drive is not "rebuilt from parity". It is rebuilt by reading all the 'good' drives in conjunction with the parity drive. This is why you never want untrustworthy drives in the array, as a rebuild requires that all 'good' drives are read without error. It is possible this is what you meant, but it may also mean that you do not understand how parity works.

I do have 2 parity disks installed (each 8 TB), which in theory should give me coverage for 2 drive failures. What do the other disks (smaller 1 TB drives and cache drives) have to do with parity or the rebuilding process?

16 minutes ago, johnnie.black said:

SMART report will be a good indication, you can also run an extended SMART test.

I'll look at the SMART values and do a SMART test. As disk 5 is inside a backplane, can I disconnect this drive and connect it via a normal SATA port on the same machine without losing the array?

19 minutes ago, BR0KK said:

As disk 5 is inside a backplane, can I disconnect this drive and connect it via a normal SATA port on the same machine without losing the array?

Yes.

39 minutes ago, BR0KK said:

What do the other disks (smaller 1 TB drives and cache drives) have to do with parity or the rebuilding process?

During a rebuild ALL the ‘good’ data drives are read for each sector that is being rebuilt on the failed drive.    The system uses the contents of the sectors on the ‘good’ drives in conjunction with the same sector on the parity drive to work out what must have been in the sector on the failed drive to give the right value for the parity drive.
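
A rough Python sketch of that per-sector calculation, including a data drive that is smaller than the failed one. It assumes single XOR parity, models each sector as one integer, and treats sectors past the end of a smaller drive as zero. The drive contents and function names are invented for illustration.

```python
def parity_sector(disks, sector):
    """XOR one sector across all disks; a disk too small to have it contributes zero."""
    value = 0
    for disk in disks:
        if sector < len(disk):
            value ^= disk[sector]
    return value

def build_parity(disks):
    """Compute parity over the full length of the largest data disk."""
    size = max(len(disk) for disk in disks)
    return [parity_sector(disks, sector) for sector in range(size)]

def rebuild_disk(good_disks, parity, failed_size):
    """Recover every sector of the failed disk from the good disks plus parity."""
    return [parity_sector(good_disks, sector) ^ parity[sector]
            for sector in range(failed_size)]

# Hypothetical array: one small drive and two larger ones (sector values are made up).
small_drive = [7, 9]
large_drive = [1, 2, 3, 4]
failed_drive = [5, 6, 7, 8]

parity = build_parity([small_drive, large_drive, failed_drive])
rebuilt = rebuild_disk([small_drive, large_drive], parity, len(failed_drive))
assert rebuilt == failed_drive
```

The small drive is needed for the first two sectors even though it is much smaller than the drive being rebuilt. If either of those reads failed, the rebuilt sectors would be wrong.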


OK, I have some reading to do.

I restarted my server and it seems as if my backplane is dying. The drive is not recognised when I leave it in the slot it was in before.

On a different port (just SATA) I could assign the drive again and now the rebuild is in progress. 8 TB will take a day to restore :/

 

Thank you for the help :) 

 

tower-smart-20191021-1936.zip
tower-diagnostics-20191021-1737.zip


Disk looks fine.

Forgot to mention earlier: you're using a SATA controller with, or connected to, a SATA port multiplier, and it was showing some timeout errors on one of the connected devices in the previous diags. Port multipliers are not recommended as they tend to cause timeouts and/or dropped devices, but they do work for some users. Just something to keep in mind.

 

16 minutes ago, BR0KK said:

8 TB will take a day to restore

It also won't help with this, as performance is not as good as a dedicated HBA.
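
As a back-of-the-envelope check on the "a day" figure, rebuild time is roughly disk size divided by average throughput. The speeds in this Python sketch are hypothetical round numbers, not measurements from this server, and 1 TB is taken as 10^12 bytes.

```python
TB = 10**12  # bytes, decimal terabyte
MB = 10**6   # bytes, decimal megabyte

def rebuild_hours(disk_bytes, avg_bytes_per_sec):
    """Estimate rebuild duration in hours from disk size and average speed."""
    return disk_bytes / avg_bytes_per_sec / 3600

# Hypothetical average speeds in MB/s for an 8 TB rebuild.
for speed in (50, 100, 150):
    hours = rebuild_hours(8 * TB, speed * MB)
    print(f"{speed} MB/s -> {hours:.1f} h")
```

An average of about 100 MB/s works out to roughly 22 hours for 8 TB, so a full day is plausible. Anything that lowers the average pushes it well beyond that.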


Yeah, I know this system isn't great but it worked for a while.

Everything works, and for the age and configuration of my N40L the performance I get isn't really that bad :D

The N40L backplane is directly connected to the mainboard. It features an SFF-8087 to 4x SATA breakout. It's not a real backplane; maybe it's connected internally as a port multiplier?

The port multiplier (sylba 8 port SATA) is where all the small drives and the cache drives are connected.

The N40L will get its retirement eventually. New hardware is on the way anyway; I thought I'd try this on my N40L before investing in a more stable permanent solution. Maybe it becomes a backup system for my Unraid, based on FreeNAS :P (before Unraid I used that OS and it also worked OK on the N40L)

 


System is back up and working correctly after the rebuild. It even recognised disk 5 in the pseudo backplane. I'll watch this closely and as soon as the new hardware arrives I'll try to move everything there.

Thank you very much for the help provided :)

