morikaweb Posted January 2, 2017
This morning I was looking at my server when I noticed that the parity drive had suddenly become disabled. When I checked the logs I found:
Jan 2 10:27:26 Candle-Keep emhttp: err: mdcmd: write: Input/output error
Jan 2 10:27:26 Candle-Keep kernel: mdcmd (130): spindown 0
Jan 2 10:27:26 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 2 10:27:26 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e0 00
Jan 2 10:27:26 Candle-Keep kernel: md: do_drive_cmd: disk0: ATA_OP e0 ioctl error: -5
Jan 2 10:27:30 Candle-Keep emhttp: err: mdcmd: write: Input/output error
Jan 2 10:27:30 Candle-Keep kernel: mdcmd (131): spindown 0
Jan 2 10:27:30 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 2 10:27:30 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e0 00
Jan 2 10:27:30 Candle-Keep kernel: md: do_drive_cmd: disk0: ATA_OP e0 ioctl error: -5
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
This is followed by a ton of:
kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
I cannot get a SMART report from the parity drive, as it produces the following error:
Terminate command early due to bad response to IEC mode page
Does anyone have an idea what happened? Is my parity drive dead? My syslog is here: http://pastebin.com/BxTDrqfT
JorgeB Posted January 2, 2017
Reboot to see if it comes online and get a SMART report.
morikaweb Posted January 2, 2017
Will do. I am also going to open it up and make sure all the cables are snug.
morikaweb Posted January 2, 2017
I stopped the box and checked all the cables but found nothing loose. I then powered it back up and it came up with no errors, but the parity is still disabled. I managed to run a SMART test, which is here: http://pastebin.com/5r9XvGtY. I'm no expert, but the SMART log seems to say there is nothing wrong with the drive. This leaves me very concerned about what caused the error.
JorgeB Posted January 2, 2017
Disk looks good. Looking at the syslog, the problem was the SASLP: it crashed and unRAID lost contact with the parity disk. Coincidentally, the exact same thing happened to me this weekend on the same controller. In my case the disk that was dropped was an unassigned device used as the 2nd disk of my main VM, which made the VM crash.
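For readers who want to see what the kernel lines in the first post were reporting, here is a minimal Python sketch that decodes them. It is not part of unRAID or of anything posted in this thread; the function names and lookup tables are made up for illustration and cover only the codes that appear in the log. It assumes the standard ATA PASS-THROUGH (16) layout, where opcode 0x85 carries the ATA command in byte 14, and the Linux SCSI host byte values, where 0x04 is DID_BAD_TARGET.

```python
import re

# Subset of Linux SCSI host byte codes (assumed from the kernel's definitions).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
}

# ATA command codes that appear in the log from the first post.
ATA_COMMANDS = {
    0xE0: "STANDBY IMMEDIATE",
    0xE5: "CHECK POWER MODE",
    0x98: "CHECK POWER MODE (legacy code)",
}

# Matches lines like: "... [sdi] tag#0 CDB: opcode=0x85 85 06 20 ... 40 e0 00"
CDB_RE = re.compile(r"\[(\w+)\].*CDB: opcode=0x85 ((?:[0-9a-fA-F]{2} ?){16})")
HOST_RE = re.compile(r"hostbyte=0x([0-9a-fA-F]{2})")


def decode_hostbyte(line):
    """Return the host byte name from a 'Result:' line, or None if absent."""
    match = HOST_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return HOST_BYTES.get(code, f"unknown (0x{code:02x})")


def decode_cdb(line):
    """Return (device, ATA command name) from a 'CDB:' line, or None."""
    match = CDB_RE.search(line)
    if match is None:
        return None
    cdb = [int(byte, 16) for byte in match.group(2).split()]
    command = cdb[14]  # byte 14 of ATA PASS-THROUGH (16) is the ATA command
    return match.group(1), ATA_COMMANDS.get(command, f"unknown (0x{command:02x})")


if __name__ == "__main__":
    sample = [
        "kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00",
        "kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e0 00",
    ]
    for entry in sample:
        print(decode_hostbyte(entry), decode_cdb(entry))
```

Read this way, the log shows spin-down and power-mode queries to sdi failing at the host adapter level rather than the disk returning an error of its own, which would be consistent with the controller having lost contact with the drive.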
morikaweb Posted January 2, 2017
I might have to look into replacing that controller then; it is getting old, after all. Is it safe, then, to start a Read-Check of all data disks?
JorgeB Posted January 2, 2017
Might as well begin a parity sync instead: stop the array, unassign parity, start the array, stop the array, then reassign parity to begin the sync. The controller may be OK. I hope mine is, although if this happens again I'll replace it.
morikaweb Posted January 2, 2017
Thanks for your help, you have been a lifesaver. BTW, do you think this is a good replacement card: http://www.ncix.com/detail/supermicro-aoc-sas2lp-mv8-8-channel-6gb-s-bf-62032.htm
JorgeB Posted January 2, 2017
Yes, if you don't use VT-d (virtualization passthrough); a few users have had issues with that card when VT-d is enabled. I have 2 without any problems, but I don't use VT-d on those servers. If you need VT-d, I recommend getting an LSI-based controller, e.g., the 9211-8i. Most people get the Dell H310 or IBM M1015 because they are cheaper on eBay and can be crossflashed to LSI IT mode, becoming for all purposes an LSI 9211-8i.
itimpi Posted January 2, 2017
I have found that it is relatively easy for the SASLP controller to end up not perfectly aligned with the motherboard, as the connector is so short. This tends to lead to momentary disconnects, particularly when the system is under load. It is well worth checking for, as it is by no means obvious at a quick glance.
morikaweb Posted January 2, 2017
I thought of that, but when I opened up the system the card seemed flush. Anyway, I have ordered an LSI 9211-8i, which I hope will resolve these issues.