tyrindor Posted July 6, 2017
Just had a disk taken offline, the one that holds my most important files. I'm copying them from the emulated drive to my Windows PC for a quick backup. I'm not sure how to check the disk's SMART data; it won't let me and says "Can not read attributes". Is it completely dead or something? I'm not sure of the best way to go about this once my files are copied to another PC. It's not letting me run any tests on the drive. It's repeating these messages over and over now:
Jun 29 13:34:04 UNRAID kernel: sd 1:0:6:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jun 29 13:34:04 UNRAID kernel: sd 1:0:6:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Jun 29 13:34:04 UNRAID kernel: sd 1:0:6:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jun 29 13:34:04 UNRAID kernel: sd 1:0:6:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
Jun 29 13:34:04 UNRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jun 29 13:34:04 UNRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jun 29 13:34:04 UNRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Here's exactly where the crash happened in the log. It looks like it threw a SAS error, followed by read/write errors on disk1.
aaa.txt
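For anyone trying to read kernel messages like the ones quoted above, here is a rough Python sketch that decodes the two repeating line types. The lookup tables are small subsets I filled in myself (an assumption, not a complete list), and `decode_line` is a made-up helper name, not part of any tool. It relies on opcode 0x85 being ATA PASS-THROUGH (16), where the ATA command sits in byte 14 of the CDB.

```python
import re

# Subset of Linux SCSI host byte codes; extend as needed.
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}

# Subset of ATA command opcodes; extend as needed.
ATA_COMMANDS = {
    0xE5: "CHECK POWER MODE",
    0x98: "CHECK POWER MODE (older alternate opcode)",
    0xB0: "SMART",
    0xEC: "IDENTIFY DEVICE",
}

RESULT_RE = re.compile(r"\[(sd[a-z]+)\].*hostbyte=0x([0-9a-fA-F]+)")
CDB_RE = re.compile(
    r"\[(sd[a-z]+)\].*CDB: opcode=0x[0-9a-fA-F]+((?: [0-9a-fA-F]{2})+)"
)


def decode_line(line):
    """Return a short description of a kernel SCSI log line, or None."""
    m = RESULT_RE.search(line)
    if m:
        code = int(m.group(2), 16)
        return f"{m.group(1)}: host byte {HOST_BYTES.get(code, hex(code))}"
    m = CDB_RE.search(line)
    if m:
        cdb = bytes.fromhex(m.group(2).replace(" ", ""))
        # ATA PASS-THROUGH (16): byte 14 of the CDB carries the ATA command.
        if len(cdb) == 16 and cdb[0] == 0x85:
            cmd = cdb[14]
            name = ATA_COMMANDS.get(cmd, hex(cmd))
            return f"{m.group(1)}: ATA pass-through, command {name}"
        return f"{m.group(1)}: SCSI opcode {cdb[0]:#04x}"
    return None


if __name__ == "__main__":
    sample = [
        "Jun 29 13:34:04 UNRAID kernel: sd 1:0:6:0: [sdi] tag#0 UNKNOWN(0x2003) "
        "Result: hostbyte=0x04 driverbyte=0x00",
        "Jun 29 13:34:04 UNRAID kernel: sd 1:0:6:0: [sdi] tag#0 CDB: opcode=0x85 "
        "85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00",
    ]
    for entry in sample:
        print(decode_line(entry))
```

Read this way, the repeating pairs look like power-mode queries to sdi coming back with DID_BAD_TARGET, which would fit a drive that has dropped off the controller rather than one reporting media errors. That is only an interpretation of these few lines, not a diagnosis.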
tyrindor Posted July 6, 2017 Author Share Posted July 6, 2017 Restarting the server, drive still didn't show up. Hotswapped it and now it showed up. SMART looks fine? Passes quick SMART test... Should I rebuild the data back onto it, or disable parity and build a new parity. My last parity check was 36 days ago, so there is a chance it could be wrong if I have SAS errors going around. Either way I already transferred off my irreplaceable files. My guess is that because this is a shingled archive drive, something got botched up and the SAS card freaked out when it couldn't write immediately? 1 Raw read error rate 0x000f 114 099 006 Pre-fail Always Never 70247960 3 Spin up time 0x0003 090 090 000 Pre-fail Always Never 0 4 Start stop count 0x0032 100 100 020 Old age Always Never 704 5 Reallocated sector count 0x0033 100 100 010 Pre-fail Always Never 0 7 Seek error rate 0x000f 079 060 030 Pre-fail Always Never 85358140 9 Power on hours 0x0032 089 089 000 Old age Always Never 10194 (1y, 1m, 28d, 18h) 10 Spin retry count 0x0013 100 100 097 Pre-fail Always Never 0 12 Power cycle count 0x0032 100 100 020 Old age Always Never 49 183 Runtime bad block 0x0032 100 100 000 Old age Always Never 0 184 End-to-end error 0x0032 100 100 099 Old age Always Never 0 187 Reported uncorrect 0x0032 100 100 000 Old age Always Never 0 188 Command timeout 0x0032 100 100 000 Old age Always Never 0 189 High fly writes 0x003a 100 100 000 Old age Always Never 0 190 Airflow temperature cel 0x0022 068 049 045 Old age Always Never 32 (min/max 32/32) 191 G-sense error rate 0x0032 100 100 000 Old age Always Never 0 192 Power-off retract count 0x0032 100 100 000 Old age Always Never 151 193 Load cycle count 0x0032 100 100 000 Old age Always Never 1423 194 Temperature celsius 0x0022 032 051 000 Old age Always Never 32 (0 20 0 0 0) 195 Hardware ECC recovered 0x001a 114 099 000 Old age Always Never 70247960 197 Current pending sector 0x0012 100 100 000 Old age Always Never 0 198 Offline uncorrectable 
0x0010 100 100 000 Old age Offline Never 0 199 UDMA CRC error count 0x003e 200 200 000 Old age Always Never 0 240 Head flying hours 0x0000 100 253 000 Old age Offline Never 1247 (205 155 0) 241 Total lbas written 0x0000 100 253 000 Old age Offline Never 28089705984 242 Total lbas read 0x0000 100 253 000 Old age Offline Never 258281706892 Link to comment
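If it helps to sanity-check a SMART attribute listing like the one posted above, here is a minimal Python sketch. It flags an attribute when its normalized value is at or below a non-zero threshold, or when the raw count is non-zero for a handful of attributes that people commonly watch. The watch list (5, 187, 188, 197, 198, 199) and the helper names `parse_rows` and `find_warnings` are my own assumptions, not anything unRAID or smartctl defines. Large raw numbers on attributes 1, 7 and 195 are deliberately ignored, since on Seagate drives those raw fields are generally understood to be composite counters rather than simple error counts.

```python
import re

# One attribute per line: id, name, flag, value, worst, threshold,
# type, updated, failed, raw (only the leading integer of raw is used).
ROW_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>.+?)\s+(?P<flag>0x[0-9a-fA-F]{4})\s+"
    r"(?P<value>\d{3})\s+(?P<worst>\d{3})\s+(?P<thresh>\d{3})\s+"
    r"(?P<type>Pre-fail|Old age)\s+(?P<updated>\S+)\s+(?P<failed>\S+)\s+"
    r"(?P<raw>\d+)"
)

# Assumed watch list: attributes where any non-zero raw count deserves a look.
WATCH_RAW = {5, 187, 188, 197, 198, 199}


def parse_rows(text):
    """Parse attribute lines into dicts, skipping lines that do not match."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            rows.append({
                "id": int(m.group("id")),
                "name": m.group("name"),
                "value": int(m.group("value")),
                "thresh": int(m.group("thresh")),
                "raw": int(m.group("raw")),
            })
    return rows


def find_warnings(rows):
    """Return human-readable warnings for suspicious attributes."""
    found = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            found.append(f"{r['id']} {r['name']}: value {r['value']} "
                         f"at or below threshold {r['thresh']}")
        elif r["id"] in WATCH_RAW and r["raw"] > 0:
            found.append(f"{r['id']} {r['name']}: raw count {r['raw']}")
    return found


# A few rows copied from the post above.
POSTED = """\
1 Raw read error rate 0x000f 114 099 006 Pre-fail Always Never 70247960
5 Reallocated sector count 0x0033 100 100 010 Pre-fail Always Never 0
7 Seek error rate 0x000f 079 060 030 Pre-fail Always Never 85358140
187 Reported uncorrect 0x0032 100 100 000 Old age Always Never 0
188 Command timeout 0x0032 100 100 000 Old age Always Never 0
197 Current pending sector 0x0012 100 100 000 Old age Always Never 0
198 Offline uncorrectable 0x0010 100 100 000 Old age Offline Never 0
199 UDMA CRC error count 0x003e 200 200 000 Old age Always Never 0
"""

if __name__ == "__main__":
    print(find_warnings(parse_rows(POSTED)))
```

On the rows copied from the post this reports nothing, which matches the "SMART looks fine" reading. Note that a clean result, including zero UDMA CRC errors and zero command timeouts, only says the drive itself recorded no problems; it cannot rule out a cable, backplane, or controller hiccup.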
JorgeB Posted July 6, 2017
9 minutes ago, tyrindor said: My guess is that because this is a shingled archive drive, something got botched up and the SAS card freaked out when it couldn't write immediately?
Possible, but IMO not likely. SMART looks fine, so rebuild to the same disk. You could swap cables/backplane with another disk before rebuilding, just to rule that out.
tyrindor Posted July 6, 2017 Author
K, rebuilding to the same disk and I'll see how it goes. My setup is pure SAS cables, so it's 1 SAS cable for 4 drives. I don't have any spares, sadly, but I'll move the drive to another slot if it happens again.
JorgeB Posted July 6, 2017
12 minutes ago, tyrindor said: but i'll move the drive to another slot if it happens again.
That works. Use a slot that is on a different SAS cable as well.
tyrindor Posted July 6, 2017 Author
15% into the rebuild and still nothing wrong... I'm not sure if that's a good thing or a bad thing. Something definitely went wrong, but if the rebuild goes fine then who knows what it was.
tyrindor Posted July 7, 2017 Author
Data rebuild was successful. SMART still shows nothing wrong. No errors in the log. I don't understand what the problem was. Either a SAS cable/card/hotswap bay had a fluke... or the Seagate Archive drive had an issue with its shingling technology, which resulted in unRAID thinking the drive was unresponsive. I'll keep my fingers crossed that it doesn't happen again.
Archived
This topic is now archived and is closed to further replies.