HKR Posted September 13, 2015
Hello, I am running unRAID 6, and it seems one of my 4TB data disks has gone bad. I am going to try to copy all the data from it to a backup disk outside of the array and send the disk in for RMA. Please let me know if the procedure I plan to follow is correct:
1) Copy the data over to the backup disk.
2) Stop the array and remove the bad disk.
3) Start the array and continue using it until the replacement disk arrives from RMA.
4) Copy the backed-up data back to the array after putting in the new disk.
Thanks
itimpi Posted September 13, 2015
Strictly speaking, steps 1 and 4 are unnecessary (although they do no damage and provide a level of resilience against further failures). When the replacement disk arrives, you should just be able to assign it to the same slot as the failed disk and let unRAID rebuild it; if nothing goes wrong, the contents will still be there.
BTW: What makes you think the disk has gone bad? Many times a disk is 'disabled' by unRAID because of an external factor (cabling/power). A SMART report for the disk could allow for some analysis of whether this is the case. Actually, the easiest thing would be to provide the standard diagnostics (Tools->Diagnostics), as that includes SMART reports as part of its contents.
HKR Posted September 13, 2015
This is actually a new disk; it's barely 3 months old, and at every weekly parity scan the disk kept popping up errors. Today there were several errors, the parity check was aborted, and now the disk is in a disabled state. I have attached the requested report. WDC_WD40EZRX-00SPEB0_WD-WCC4E4XV65XT - 4 TB (sdh) is the disk throwing errors.
tower-diagnostics-20150913-2301.zip
itimpi Posted September 13, 2015
The SMART report for that disk is empty, which suggests that it has dropped offline. Looking at the syslog, it is not obvious whether it was disk5 itself or an external factor that caused the initial failure.
HKR Posted September 13, 2015
All right, how do I go about enabling the disk to get a SMART report, then? Or what should my next move be?
itimpi Posted September 13, 2015
One normally has to reboot the system if a drive has dropped offline.
JonathanM Posted September 13, 2015
Sometimes even a complete power off is necessary to get things back in sync.
trurl Posted September 13, 2015
Check the power and SATA connections to the drive.
HKR Posted September 14, 2015
I tried restarting and powering off the system, but the drive still remains in a disabled state... I have attached a new diagnostics report; it shows the SMART report of the drive in question. Please have a look.
tower-diagnostics-20150914-1340.zip
itimpi Posted September 14, 2015
HKR wrote: "I tried restart/power off the system but the drive still remains in disabled state..."
Once a disk has been disabled by unRAID, it will stay disabled until you rebuild it. If you think that the disable was due to an external factor and the drive is OK, then you can rebuild back onto the same drive.
HKR wrote: "I have attached a new diagnostics report, it shows the SMART report of the drive in question, please have a look."
The SMART report for that drive looks fine in this latest report. This raises the question of why the disable happened in the first place. It could be a cabling/power/controller type issue rather than the drive itself. The fact that it seemed to take a power off/on to bring the drive back online suggests that the controller went down for some reason.
HKR Posted September 15, 2015
OK, so I will check all the cabling and connectors. Should I then do a parity check to get the drive back online? I tried copying about 2.5TB of data to a spare backup drive and it went fine without any errors, which makes me think the drive is just fine, as you said earlier.
HKR Posted September 15, 2015
Also, I forgot to add: the option to start a parity check is no longer visible!
itimpi Posted September 15, 2015
You cannot run a parity check on a system with a disabled drive. The way forward is to rebuild the disabled drive to get back to a protected state.
HKR Posted September 16, 2015
Can you please tell me how I can do that?
garycase Posted September 16, 2015
HKR wrote: "Ok, So i will check all the cabling and connectors, then should i do a parity check to get the drive back online?"
You can't do a parity check with a disabled drive. (The option won't even be available.)
HKR wrote: "I tried to copy over about 2.5TB of data to a spare backup drive and it went just fine without any errors, which makes me think the drive is just fine as you said earlier."
You were copying from the emulated drive, i.e. the drive's contents were being recreated by reading ALL of the other disks in the array plus parity. You were NOT reading from the failed drive, so the successful copy says nothing about that drive's health.
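To make the emulation described above concrete, here is a minimal Python sketch of XOR-based single parity. It assumes that parity is the bytewise XOR of all data disks; the three tiny byte strings and the helper name `xor_blocks` are purely illustrative and are not taken from unRAID itself.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Return the bytewise XOR of equal-length byte strings."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Hypothetical contents of three tiny data "disks".
disk1 = b"\x10\x20\x30\x40"
disk2 = b"\x0a\x0b\x0c\x0d"  # the disk that will be disabled
disk3 = b"\xff\x00\xff\x00"

# Parity is computed across all data disks while the array is healthy.
parity = xor_blocks([disk1, disk2, disk3])

# With disk2 disabled, its contents are recreated from every other
# disk plus parity, without touching disk2 at all.
emulated_disk2 = xor_blocks([disk1, disk3, parity])

print(emulated_disk2 == disk2)
```

The reconstruction never reads `disk2`, which is why a clean copy from an emulated drive only shows that the other disks and parity are readable.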
garycase Posted September 16, 2015
HKR wrote: "Can you please tell me how i can do that?"
Assuming you've replaced and/or reseated the cables, and the drive is now "seen" okay, you need to do the following:
(a) Stop the array and unassign the drive from its current slot.
(b) Start the array -- you will now see a "missing" drive.
(c) Stop the array and assign the drive to the empty (missing) drive slot.
(d) Start the array and let it do the rebuild.
Note this is the same procedure you'd use if you were replacing the drive with another drive -- just in case that becomes necessary.
garycase Posted September 16, 2015
A comment re: your original plan ...
1) Copy data over to backup disk
2) Stop the array, remove the bad disk
3) start the array and continue using it till the replacment disk arrives from RMA
4) Copy over the backed up data back to the array after putting in the new disk.
You do NOT want to do #3. If you're going to wait for a replacement drive, I would leave the array off until it arrives UNLESS you have a complete set of backups. Without that drive your array is completely unprotected -- should another drive fail, you'll lose all of the data on that drive, and the missing drive can no longer be rebuilt from parity either.
RobJ Posted September 16, 2015
HKR wrote: "Can you please tell me how i can do that?"
The standard procedure is to unassign the drive, start and stop the array (to get unRAID to forget the drive), then re-assign the drive and start the array, which should begin the rebuild.
HKR Posted September 17, 2015
I have tried replacing the cables and switching the cables; nothing seems to work. Although the disk says it passed the SMART test, it just won't get detected by the system. I still get "Device is Disabled".
garycase Posted September 17, 2015
Did you un-assign the disk, as I noted above? You have to do that before you can re-assign it.
trurl Posted September 17, 2015
"Disabled" doesn't mean undetected. It means unRAID won't use it until it's rebuilt.
HKR Posted September 18, 2015
Thank you for all the help; the disk in question is working fine now. Data reconstruction has completed.
HKR Posted September 21, 2015
Guys, I still think there is something wrong with this disk. The scheduled parity check usually takes about 12-15 hours to complete, but right now it has been running for over 2 days and says 75 days to go... The moment it starts reading/writing Disk 5 data, the parity check comes to a standstill. Are there any tests we could run to confirm the disk is actually bad?
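One common way to test a suspect drive is its own extended SMART self-test, started and read back with `smartctl` from smartmontools. The Python sketch below only builds the two command lines; it assumes `smartctl` is available on the server's command line and that the disk still appears as `/dev/sdh` (the name quoted earlier in the thread, which can change after a reboot). The helper name `smart_commands` is hypothetical.

```python
def smart_commands(device):
    """Build smartctl command lines for an extended self-test and a full report."""
    return {
        # Starts the drive's extended (long) self-test in the background.
        "start_long_test": ["smartctl", "-t", "long", device],
        # Prints all SMART information, including the self-test log.
        "full_report": ["smartctl", "-a", device],
    }


if __name__ == "__main__":
    # Assumed device name; confirm it before running the commands.
    for name, command in smart_commands("/dev/sdh").items():
        print(name + ": " + " ".join(command))
```

The extended test can take many hours on a 4 TB drive; once it finishes, the full report shows whether it completed without error, along with attributes such as reallocated and pending sector counts.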
garycase Posted September 21, 2015
Did you replace the drive and rebuild it onto a new drive ... or did you rebuild it onto itself? It sounds like the drive was in fact defective, but you may have rebuilt it onto itself -- and not actually confirmed that it was successful afterwards.
This topic is now archived and is closed to further replies.