[Solved] Parity Drive disabled plus smartctl errors



This morning I was looking at my server when I noticed that the parity drive had suddenly become disabled. When I checked the logs, they showed:

 

Jan  2 10:27:26 Candle-Keep emhttp: err: mdcmd: write: Input/output error

Jan  2 10:27:26 Candle-Keep kernel: mdcmd (130): spindown 0

Jan  2 10:27:26 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Jan  2 10:27:26 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e0 00

Jan  2 10:27:26 Candle-Keep kernel: md: do_drive_cmd: disk0: ATA_OP e0 ioctl error: -5

Jan  2 10:27:30 Candle-Keep emhttp: err: mdcmd: write: Input/output error

Jan  2 10:27:30 Candle-Keep kernel: mdcmd (131): spindown 0

Jan  2 10:27:30 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Jan  2 10:27:30 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e0 00

Jan  2 10:27:30 Candle-Keep kernel: md: do_drive_cmd: disk0: ATA_OP e0 ioctl error: -5

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00

Jan  2 10:27:34 Candle-Keep kernel: sd 1:0:1:0: [sdi] tag#0 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
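For anyone trying to read lines like these, here is a minimal Python sketch that decodes the two fields that carry most of the information: the `hostbyte` value and the ATA command inside the `0x85` (ATA PASS-THROUGH 16) CDB. The lookup tables are a small subset based on my reading of the Linux SCSI host byte codes and the ATA command set, and the function name `decode_line` is made up for illustration, so treat it as a sketch rather than a reference.

```python
import re

# Subset of Linux SCSI host byte codes (assumed from the kernel's definitions).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
}

# Subset of ATA commands that appear in the log above.
ATA_COMMANDS = {
    0xE0: "STANDBY IMMEDIATE",
    0xE5: "CHECK POWER MODE",
    0x98: "CHECK POWER MODE (legacy opcode)",
}

HOST_RE = re.compile(r"hostbyte=0x([0-9a-f]{2})")
CDB_RE = re.compile(r"CDB: opcode=0x[0-9a-f]{2}((?: [0-9a-f]{2})+)")


def decode_line(line):
    """Return a short description of a kernel 'sd' error line, or None."""
    m = HOST_RE.search(line)
    if m:
        code = int(m.group(1), 16)
        return "hostbyte " + HOST_BYTES.get(code, "unknown")
    m = CDB_RE.search(line)
    if m:
        cdb = [int(b, 16) for b in m.group(1).split()]
        # In a 16-byte ATA PASS-THROUGH CDB, byte 14 holds the ATA command.
        if cdb[0] == 0x85 and len(cdb) == 16:
            cmd = cdb[14]
            return "ATA pass-through: " + ATA_COMMANDS.get(cmd, hex(cmd))
        return "SCSI opcode " + hex(cdb[0])
    return None
```

Read this way, the log shows spin-down (`e0`) and power-state checks (`e5`, `98`) being rejected at the host level, which suggests the commands never reached the disk.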

 

 

This is followed by a ton of:

 

kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

 

I cannot get a smart report from the parity drive as it produces the following error:

 

Terminate command early due to bad response to IEC mode page

 

Does anyone have an idea what happened? Is my parity drive dead? My syslog is here: http://pastebin.com/BxTDrqfT

 

 

Link to comment

The disk looks good. Looking at the syslog, the problem was the SASLP: it crashed and unRAID lost contact with the parity disk.

 

Coincidentally, the exact same thing happened to me this weekend on the same controller. In my case the disk that got ejected was an unassigned device used as the 2nd disk of my main VM, which made the VM crash.
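A rough way to check a syslog for this kind of pattern is to group the failing devices by SCSI host number, which is the first field of the `sd H:C:T:L` address in each kernel line. The Python sketch below does that; the function name `failing_devices_by_host` is hypothetical, and the heuristic is only an assumption: several disks failing on one host at the same time would be consistent with a controller problem, while a single failing device does not settle it either way.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   kernel: sd 1:0:1:0: [sdi] tag#0 ... Result: hostbyte=0x04 driverbyte=0x00
ADDR_RE = re.compile(
    r"kernel: sd (\d+):(\d+):(\d+):(\d+): \[(sd[a-z]+)\].*hostbyte=0x([0-9a-f]{2})"
)


def failing_devices_by_host(lines):
    """Map SCSI host number -> sorted list of devices with a non-zero hostbyte."""
    hosts = defaultdict(set)
    for line in lines:
        m = ADDR_RE.search(line)
        if m and int(m.group(6), 16) != 0:
            hosts[int(m.group(1))].add(m.group(5))
    return {host: sorted(devs) for host, devs in hosts.items()}
```

To use it, pass the lines of the saved syslog, for example `failing_devices_by_host(open("syslog.txt"))`, where `syslog.txt` is a placeholder file name.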

Link to comment


I might have to look into replacing that controller then; it is getting old after all. Should it be safe to start a Read-Check of all data disks then?

Link to comment

Yes. Note that this matters mostly with VT-d (virtualization pass-through): a few users have had issues with the SASLP when it is enabled.

 

I have two of these controllers running without any problems, but I don't use VT-d on those servers.

 

If you need VT-d, I recommend getting an LSI-based controller, e.g., the 9211-8i. Most people get the Dell H310 or IBM M1015 because they are cheaper on eBay and can be crossflashed to LSI IT mode, becoming for all purposes an LSI 9211-8i.

Link to comment

I have found that it is relatively easy for the SASLP controller to end up not perfectly aligned with the motherboard, as the connector is so short. This tends to lead to momentary disconnects, particularly when the system is under load. It is well worth checking for, as it is by no means obvious at a quick glance.

Link to comment


I thought of that, but when I opened up the system the card seemed flush. Anyway, I have ordered an LSI 9211-8i, which I hope will resolve these issues.

Link to comment
