johnodon Posted August 4, 2015

Diagnostics attached. I was running a parity check and DISK2 decided to check out at some point. What are my next steps? Do I need to replace it? Does the log indicate another issue (e.g. a SATA cable)? Naturally, my main concern is the health of my parity.

TIA!
John

Attachment: unraid-diagnostics-20150804-0635.zip
trurl Posted August 4, 2015

johnodon wrote: "...Naturally, my main concern is the health of my parity."

Do you have any reason to think your parity is bad? The disabled disk doesn't have any SMART data in the diagnostics. Do you have another disk for the rebuild?
itimpi Posted August 4, 2015

It looks as if disk2 has probably dropped offline. Rebooting the system may bring it back so that the SMART attributes can be checked.

I notice that disk3 also has one Pending sector. This could mean that during a rebuild of another disk, the corresponding sector on the rebuilt disk will come out incorrect.
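To illustrate why a bad sector on disk3 matters when rebuilding disk2, here is a minimal sketch assuming single XOR parity across the data disks. The 4-byte "sectors" and the helper name `xor_blocks` are made up for illustration, and the unreadable sector is modelled simply as returning wrong bytes, which is a simplification of what a pending sector actually does.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of the same sector on three data disks.
disk1 = bytes([0x11, 0x22, 0x33, 0x44])
disk2 = bytes([0xAA, 0xBB, 0xCC, 0xDD])  # the disk that will be rebuilt
disk3 = bytes([0x0F, 0xF0, 0x55, 0x99])

# Parity is the XOR of all data disks at that sector.
parity = xor_blocks([disk1, disk2, disk3])

# Rebuild with every surviving disk readable: disk2 is recovered exactly.
rebuilt_ok = xor_blocks([parity, disk1, disk3])

# Rebuild when disk3 returns wrong data for one byte (assumed failure mode).
disk3_bad = bytes([0x0F, 0xF0, 0x00, 0x99])
rebuilt_bad = xor_blocks([parity, disk1, disk3_bad])

print(rebuilt_ok == disk2)   # the clean rebuild matches the original
print(rebuilt_bad == disk2)  # the rebuild with bad input does not
```

The rebuilt disk is only as good as every other disk plus parity at that offset, which is why a pending sector on a different disk is worth clearing up before relying on a rebuild.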
johnodon Posted August 4, 2015

trurl wrote: "Do you have any reason to think your parity is bad?"

Only because I saw the one pending sector that itimpi mentioned and I have not run a parity check in a month or so.

trurl wrote: "The disabled disk doesn't have any SMART data in the diagnostics. Do you have another disk for the rebuild?"

I don't have a spare on hand. I rebooted the server and DISK2 did come back online. I unassigned/reassigned it and it is rebuilding now. If it chokes, I'll run out to a local store and get a replacement.
johnodon Posted August 4, 2015

itimpi wrote: "It looks as if disk2 has probably dropped offline. Rebooting the system may bring it back so that the SMART attributes can be checked."

SMART Attributes for DISK2:

#    Attribute Name           Flag    Value  Worst  Threshold  Type      Updated  Failed  Raw Value
1    Raw Read Error Rate      0x002f  200    200    051        Pre-fail  Always   -       0
3    Spin Up Time             0x0027  226    216    021        Pre-fail  Always   -       8700
4    Start Stop Count         0x0032  100    100    000        Old age   Always   -       915
5    Reallocated Sector Ct    0x0033  200    200    140        Pre-fail  Always   -       0
7    Seek Error Rate          0x002e  200    200    000        Old age   Always   -       0
9    Power On Hours           0x0032  029    029    000        Old age   Always   -       52195 (5y, 349d, 19h)
10   Spin Retry Count         0x0032  100    100    000        Old age   Always   -       0
11   Calibration Retry Count  0x0032  100    100    000        Old age   Always   -       0
12   Power Cycle Count        0x0032  100    100    000        Old age   Always   -       494
192  Power-Off Retract Count  0x0032  200    200    000        Old age   Always   -       347
193  Load Cycle Count         0x0032  200    200    000        Old age   Always   -       915
194  Temperature Celsius      0x0022  112    102    000        Old age   Always   -       38
196  Reallocated Event Count  0x0032  200    200    000        Old age   Always   -       0
197  Current Pending Sector   0x0032  200    200    000        Old age   Always   -       0
198  Offline Uncorrectable    0x0030  200    200    000        Old age   Offline  -       0
199  UDMA CRC Error Count     0x0032  200    200    000        Old age   Always   -       0
200  Multi Zone Error Rate    0x0008  200    200    000        Old age   Offline  -       0

itimpi wrote: "I notice that disk3 also has one Pending sector. This could mean that during a rebuild of another disk the corresponding sector will be rebuilt with that sector incorrect."

How do you usually deal with Pending Sectors?

Thanks for the help guys!
John
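As a rough way to read a report like the one above, the sketch below flags an attribute when its normalized value has fallen to or below a non-zero threshold, or when one of a few commonly watched raw counters is non-zero. The choice of watched IDs (5, 196, 197, 198, 199) and the function name `smart_warnings` are assumptions for illustration, not an official rule set; the DISK2 numbers are copied from the post, while the disk3 row is hypothetical apart from its raw pending count of 1.

```python
# (id, name, value, worst, threshold, raw) copied from the DISK2 report.
DISK2 = [
    (1, "Raw Read Error Rate", 200, 200, 51, 0),
    (3, "Spin Up Time", 226, 216, 21, 8700),
    (4, "Start Stop Count", 100, 100, 0, 915),
    (5, "Reallocated Sector Ct", 200, 200, 140, 0),
    (7, "Seek Error Rate", 200, 200, 0, 0),
    (9, "Power On Hours", 29, 29, 0, 52195),
    (10, "Spin Retry Count", 100, 100, 0, 0),
    (11, "Calibration Retry Count", 100, 100, 0, 0),
    (12, "Power Cycle Count", 100, 100, 0, 494),
    (192, "Power-Off Retract Count", 200, 200, 0, 347),
    (193, "Load Cycle Count", 200, 200, 0, 915),
    (194, "Temperature Celsius", 112, 102, 0, 38),
    (196, "Reallocated Event Count", 200, 200, 0, 0),
    (197, "Current Pending Sector", 200, 200, 0, 0),
    (198, "Offline Uncorrectable", 200, 200, 0, 0),
    (199, "UDMA CRC Error Count", 200, 200, 0, 0),
    (200, "Multi Zone Error Rate", 200, 200, 0, 0),
]

# Assumed watch list: raw counters that are normally expected to stay at zero.
WATCH_RAW = {5, 196, 197, 198, 199}

def smart_warnings(attrs):
    """Return human-readable warnings for a list of SMART attribute tuples."""
    warnings = []
    for attr_id, name, value, worst, threshold, raw in attrs:
        # A threshold of 0 means the attribute has no failure threshold.
        if threshold > 0 and value <= threshold:
            warnings.append(f"{name}: value {value} at or below threshold {threshold}")
        if attr_id in WATCH_RAW and raw > 0:
            warnings.append(f"{name}: raw value {raw} is non-zero")
    return warnings

# Hypothetical disk3 row: only the raw count of 1 comes from the thread.
DISK3_PENDING = [(197, "Current Pending Sector", 200, 200, 0, 1)]

print(smart_warnings(DISK2))
print(smart_warnings(DISK3_PENDING))
```

Under these assumed rules the DISK2 report produces no warnings, while a single pending sector on disk3 is flagged.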
itimpi Posted August 4, 2015

Nothing springs out from that SMART report to indicate that disk2 is in trouble. With any luck the rebuild back onto itself will go fine.

In the vast majority of cases where a user gets a disk red-balled, it seems to be a transient issue. I think the most likely cause is a momentary disconnect at the cabling level due to vibration or temperature effects, but that is just supposition.
johnodon Posted August 4, 2015

itimpi wrote: "Nothing springs out from that SMART report to indicate that disk2 is in trouble. With any luck the rebuild back onto itself will go fine. The vast majority of times that users get a disk red-balled it seems to be a transient issue. I think the most likely area is a momentary disconnect at the cabling level due to vibration or temperature effects but that is just supposition."

Funny you mention temp. That drive (and the one next to it) has been floating around 100F while all the other drives are in the high 80s to low 90s. I'll check the fan on that bay to make sure it is spinning.

John
johnodon Posted August 4, 2015

DING DING DING!! I had a stalled drive bay fan. I think a SATA cable was the culprit. It has been running for 30 mins now and temp has dropped 7 degrees and 11 degrees on those two drives.

John
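For comparing the Fahrenheit figures in these posts with the Celsius value in the SMART report, a small conversion sketch follows. It assumes the 7 and 11 degree drops are in Fahrenheit, which the post does not state, and the function names are illustrative. Note that a temperature difference converts without the 32 degree offset.

```python
def f_to_c(temp_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9

def f_delta_to_c_delta(delta_f):
    """Convert a temperature difference from Fahrenheit to Celsius degrees."""
    return delta_f * 5 / 9

# About 100F on the warm drives, in line with the 38 C in the SMART report.
print(round(f_to_c(100), 1))

# The reported drops, assuming they were measured in Fahrenheit.
print(round(f_delta_to_c_delta(7), 1))
print(round(f_delta_to_c_delta(11), 1))
```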