Bungy Posted March 3, 2015
I have two drives currently in my array that may need some attention. I just upgraded to unRAID 6 beta16b and turned on emailing of SMART report changes. Last night I received a report about two different drives that have not failed and have never red-balled, but they may be showing signs of future failure, and I wanted to get a second opinion.

Disk 3: My concerns are with Current Pending Sector and Offline Uncorrectable.

Disk 3 attached to port: sdl
ID#  ATTRIBUTE NAME           FLAG    VALUE  WORST  THRESH  TYPE      UPDATED  FAILED  RAW VALUE
  1  Raw Read Error Rate      0x002f  200    200    051     Pre-fail  Always   Never   0
  3  Spin Up Time             0x0027  164    159    021     Pre-fail  Always   Never   6783
  4  Start Stop Count         0x0032  096    096    000     Old age   Always   Never   4318
  5  Reallocated Sector Ct    0x0033  200    200    140     Pre-fail  Always   Never   0
  7  Seek Error Rate          0x002e  200    200    000     Old age   Always   Never   0
  9  Power On Hours           0x0032  048    048    000     Old age   Always   Never   38032
 10  Spin Retry Count         0x0032  100    100    000     Old age   Always   Never   0
 11  Calibration Retry Count  0x0032  100    100    000     Old age   Always   Never   0
 12  Power Cycle Count        0x0032  100    100    000     Old age   Always   Never   235
192  Power-Off Retract Count  0x0032  200    200    000     Old age   Always   Never   135
193  Load Cycle Count         0x0032  074    074    000     Old age   Always   Never   379826
194  Temperature Celsius      0x0022  120    098    000     Old age   Always   Never   30
196  Reallocated Event Count  0x0032  200    200    000     Old age   Always   Never   0
197  Current Pending Sector   0x0032  200    200    000     Old age   Always   Never   20
198  Offline Uncorrectable    0x0030  200    200    000     Old age   Offline  Never   3
199  UDMA CRC Error Count     0x0032  200    200    000     Old age   Always   Never   0
200  Multi Zone Error Rate    0x0008  200    200    000     Old age   Offline  Never   25

Disk 10: My concerns are with Reported Uncorrect, Command Timeout, and Current Pending Sector. It seems the command timeouts may be due to a bad cable, but I haven't had a chance to get to the cable to check this.

Disk 10 attached to port: sdh
ID#  ATTRIBUTE NAME           FLAG    VALUE  WORST  THRESH  TYPE      UPDATED  FAILED  RAW VALUE
  1  Raw Read Error Rate      0x002f  200    199    051     Pre-fail  Always   Never   0
  3  Spin Up Time             0x0027  170    168    021     Pre-fail  Always   Never   6466
  4  Start Stop Count         0x0032  097    097    000     Old age   Always   Never   3756
  5  Reallocated Sector Ct    0x0033  200    200    140     Pre-fail  Always   Never   0
  7  Seek Error Rate          0x002f  100    253    051     Pre-fail  Always   Never   0
  9  Power On Hours           0x0032  060    060    000     Old age   Always   Never   29646
 10  Spin Retry Count         0x0033  100    100    051     Pre-fail  Always   Never   0
 11  Calibration Retry Count  0x0032  100    100    000     Old age   Always   Never   0
 12  Power Cycle Count        0x0032  100    100    000     Old age   Always   Never   142
184  End-to-End Error         0x0033  100    100    097     Pre-fail  Always   Never   0
187  Reported Uncorrect       0x0032  100    099    000     Old age   Always   Never   1
188  Command Timeout          0x0032  100    099    000     Old age   Always   Never   4295032834
190  Airflow Temperature Cel  0x0022  067    049    040     Old age   Always   Never   33 (Min/Max 24/37)
192  Power-Off Retract Count  0x0032  200    200    000     Old age   Always   Never   69
193  Load Cycle Count         0x0032  199    199    000     Old age   Always   Never   3686
196  Reallocated Event Count  0x0032  200    200    000     Old age   Always   Never   0
197  Current Pending Sector   0x0032  200    200    000     Old age   Always   Never   1
198  Offline Uncorrectable    0x0030  200    200    000     Old age   Offline  Never   0
199  UDMA CRC Error Count     0x0032  200    200    000     Old age   Always   Never   0
200  Multi Zone Error Rate    0x0008  200    200    000     Old age   Offline  Never   0

I'm currently pulling all of the data off both drives onto a backup drive. My question is: how big of a problem are these attributes? I'm guessing I can preclear disk 3 to clear the pending sectors, but the Offline Uncorrectable count is a concern. I doubt I can add it back to the array and have peace of mind with the potential for Offline Uncorrectable to increase. I also don't think I can trust disk 10 with the large number of command timeouts. If any of you have any thoughts, I would love a second opinion.
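A note on the Command Timeout raw value of 4295032834 for disk 10: it may not be a count of four billion timeouts. On some drives the 48-bit raw value of attribute 188 is reported to pack three separate 16-bit counters, and this particular number happens to split cleanly that way (0x0001 0001 0002). The sketch below shows the split; the function name decode_raw48 is made up for illustration, and whether this drive really uses the packed layout (and what each counter means) is an assumption that depends on the vendor.

```shell
# Hypothetical helper: split a 48-bit SMART raw value into three
# 16-bit words, printed as "high middle low".
# Assumption: the drive packs three counters into attribute 188.
decode_raw48() {
    raw=$1
    echo "$(( (raw >> 32) & 0xFFFF )) $(( (raw >> 16) & 0xFFFF )) $(( raw & 0xFFFF ))"
}

decode_raw48 4295032834   # prints: 1 1 2
```

Read this way, the drive has logged only a handful of timeout events, which fits a marginal cable better than a drive that is timing out constantly.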
WeeboTech Posted March 3, 2015
I would check the cable on the drive with timeouts. I would verify all data, if possible, on the drives with pending sectors. Possibly move the data, then clear the drives to clear out the pending sectors. Or retire them; the power-on hours are pretty high. You can read about the SMART attributes here: http://en.wikipedia.org/wiki/S.M.A.R.T.
Frank1940 Posted March 3, 2015
If it were my server, I would be looking at getting a couple of new drives. One is over three years old and the other is over four. I, personally, am always proactive about getting any questionable drives out of my array and testing them later to see if they can be salvaged... Remember, if you do have even one bad drive that cannot be read, you have a fifty-fifty chance of data loss if you try to replace one of them, and close to a 100% chance if you have a problem with another drive. It is the potential failure of, or problem with, a second drive that worries me when I have a drive that is problematic. (I am a true believer in Murphy's Law.)
Bungy Posted March 3, 2015
Thanks for the responses. I'm glad to hear I'm not in a very desperate situation. I happen to have a 4TB drive that just came in the mail yesterday. My plan is to upgrade disk 3 (2TB) to 4TB and then copy my data off disk 10 (2TB) to the new drive. Is there a way to remove disk 10 from the array once its data has been copied to the 4TB drive? Could I write zeroes to the partition so that parity is updated, remove the drive, do a "trust my array" with the disk missing, and then of course do a parity sync at the end just to be sure?
JonathanM Posted March 3, 2015
Yes, if you dd zeroes to the md? device you wish to remove, you can do exactly what you want. It works quite smoothly, and would probably be a normal procedure if it weren't for the high risk of accidentally nuking the wrong drive if you aren't careful, or if you don't know what you are doing and are blindly following a guide. All this assumes your parity drive is currently 4TB or larger.
Bungy Posted March 3, 2015
Thanks for the help. When the copying is done, I'll run this command to clear out the disk before removing it from the array:
    dd if=/dev/zero of=/dev/mdX bs=2048k
I've been meaning to clean out the old WD Greens. I'm just glad they lasted 3-4 years.
trurl Posted March 3, 2015
If you zero the disk this way, parity will still be valid once the disk is removed, but after removing it you will have to set a new config and check the trust parity box.
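Since the whole procedure depends on the removed disk contributing nothing but zeroes to parity, it may be worth confirming that the dd run really finished before pulling the drive. The following is a minimal sketch, not an unRAID-supplied tool: is_zeroed is a made-up helper name, and it assumes GNU cmp (for the -n byte limit).

```shell
# Hypothetical helper: succeed only if the first NBYTES bytes of TARGET
# are all zero. Assumes GNU cmp, whose -n option limits the comparison.
is_zeroed() {
    target=$1
    nbytes=$2
    # /dev/zero supplies an endless stream of zero bytes to compare against;
    # -s suppresses output so only the exit status reports the result.
    cmp -s -n "$nbytes" "$target" /dev/zero
}
```

On the server this might be called as is_zeroed /dev/mdX "$(blockdev --getsize64 /dev/mdX)", where mdX is the same placeholder as in the dd command above and must be replaced with the device that was zeroed. Reading back a whole 2TB disk takes about as long as writing it did.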