DriveStatusError



I'm sure this is normal; I just wanted to double-check.

I'm running 5.0 b6a

 

My syslog is getting spammed with hundreds of lines of disk I/O and SAS disconnect errors.

I am on the preclear post-read of /dev/sdi.

I am assuming the drive is failing on the reads, so the kernel is dropping/re-enabling the SAS port to try to correct the read error?

While this is happening, other drives on the same SAS channel/backplane are functioning fine.

 

I am not worried, but I just wanted to double-check that I should not panic.

It could be something deeper.

I am pretty sure that drive is toast.

It had a post-it on it to check SMART.

I checked SMART; it had 6 blocks pending re-allocation.
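For anyone wanting to do the same check, the two SMART attributes that matter here can be pulled out with smartctl (assuming smartmontools is installed; substitute your own device for /dev/sdi):

```shell
# Print attribute name and raw value for the two sector-health counters.
# In "smartctl -A" output, field 2 is the attribute name and field 10
# is the raw value.
smartctl -A /dev/sdi | awk '/Reallocated_Sector_Ct|Current_Pending_Sector/ {print $2, $10}'
```

A non-zero Current_Pending_Sector count, like the 6 this drive showed, means the drive has sectors it could not read and is waiting to remap on the next write.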

I went ahead and let it fix that, then I ran the Samsung tool's extended tests with zeroing, and it passed A-OK! (LIARS!)

I popped it into unRAID, and that's what I am seeing now. I'm sure it will fail SMART again.

I now have 3 F2's here to RMA; this will most likely make #4. (I loathe Samsung support.)

 

 

May 15 12:20:52 Goliath kernel: sas: sas_ata_task_done: SAS error 8d
May 15 12:20:52 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:20:52 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:20:52 Goliath kernel: ata8: error=0x04 { DriveStatusError }
May 15 12:20:52 Goliath kernel: sas: --- Exit sas_scsi_recover_host
May 15 12:20:52 Goliath kernel: sas: sas_to_ata_err: Saw error 2. What to do?
May 15 12:20:52 Goliath kernel: sas: sas_ata_task_done: SAS error 2
May 15 12:20:52 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:20:52 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:20:52 Goliath kernel: ata8: error=0x04 { DriveStatusError }
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] Result: hostbyte=0x00 driverbyte=0x08
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] Sense Key : 0xb [current] [descriptor]
May 15 12:20:52 Goliath kernel: Descriptor sense data with sense descriptors (in hex):
May 15 12:20:52 Goliath kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
May 15 12:20:52 Goliath kernel: 00 00 00 2f 
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] ASC=0x0 ASCQ=0x0
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] CDB: cdb[0]=0x28: 28 00 59 b6 82 38 00 00 08 00
May 15 12:20:52 Goliath kernel: end_request: I/O error, dev sdi, sector 1505133112
May 15 12:20:52 Goliath kernel: Buffer I/O error on device sdi, logical block 188141639
May 15 12:21:23 Goliath kernel: sas: command 0xda2c4b40, task 0xd7dec640, timed out: BLK_EH_NOT_HANDLED
May 15 12:21:23 Goliath kernel: sas: Enter sas_scsi_recover_host
May 15 12:21:23 Goliath kernel: sas: trying to find task 0xd7dec640
May 15 12:21:23 Goliath kernel: sas: sas_scsi_find_task: aborting task 0xd7dec640
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=c51c0000 task=d7dec640 slot=c51d15d4 slot_idx=x0
May 15 12:21:23 Goliath kernel: sas: sas_scsi_find_task: querying task 0xd7dec640
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
May 15 12:21:23 Goliath kernel: sas: sas_scsi_find_task: task 0xd7dec640 failed to abort
May 15 12:21:23 Goliath kernel: sas: task 0xd7dec640 is not at LU: I_T recover
May 15 12:21:23 Goliath kernel: sas: I_T nexus reset for dev 0000000000000000
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x89800.
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1001
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy0 Unplug Notice
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1081
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x10000
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[0]
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 0 attach dev info is 0
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 0 attach sas addr is 0
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 0 byte dmaded.
May 15 12:21:23 Goliath kernel: sas: sas_form_port: phy0 belongs to port1 already(1)!
May 15 12:21:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[1]:rc= 0
May 15 12:21:25 Goliath kernel: sas: I_T 0000000000000000 recovered
May 15 12:21:25 Goliath kernel: sas: sas_ata_task_done: SAS error 8d
May 15 12:21:25 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:21:25 Goliath kernel: ata8.00: device reported invalid CHS sector 0
May 15 12:21:25 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:21:25 Goliath kernel: ata8: error=0x04 { DriveStatusError }
May 15 12:21:25 Goliath kernel: sas: --- Exit sas_scsi_recover_host
May 15 12:21:25 Goliath kernel: sas: sas_to_ata_err: Saw error 2. What to do?
May 15 12:21:25 Goliath kernel: sas: sas_ata_task_done: SAS error 2
May 15 12:21:25 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:21:25 Goliath kernel: ata8.00: device reported invalid CHS sector 0
May 15 12:21:25 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:21:25 Goliath kernel: ata8: error=0x04 { DriveStatusError }
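As a sanity check, the failing sector and logical block reported in the log above are consistent with each other if the filesystem uses 4 KB blocks (an assumption, but it matches the numbers exactly):

```python
# The kernel reports the same failure twice: once as a 512-byte sector
# (end_request line) and once as a filesystem logical block
# (Buffer I/O error line). With an assumed 4096-byte block size they agree.
sector = 1505133112           # from "end_request: I/O error, dev sdi, sector ..."
block_size = 4096             # assumed filesystem block size
logical_block = sector * 512 // block_size
print(logical_block)          # 188141639, matching "logical block 188141639"
```

So both lines point at the same single failing spot on the platter, not two separate problems.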


Mine was a bad drive.

 

On the post-read from the preclear, it took 19 hours to get from 48% to 51%. I then pulled the drive and the errors went away.

 

The drive is toast... hundreds of bad blocks.

 

The ironic part: this drive was the last 1.5TB I had that I thought worked, kept as a spare for a RAID array. While it was in preclear, the RAID dropped a drive, the first time one has dropped in that server in about 2 years.

 

I now have 5 of this model to RMA.


I have about 18 of them that I got when they first came out. They all went into RAID arrays. As I have been replacing them with larger drives, they have been going into other things (like unRAID).

 

They had been pretty stable so far. I had one DOA that I never RMA'd, then 4 went bad in the last 3 months.

 

All of the 1.5's I took out of service were replaced by F4's; so far not one has gone bad. I have about 16 of those now. Too bad they got bought out.

 

In this case, I am pretty sure it was the drive. I think this one came from my WHS box and already had SMART errors.


I am getting the same error but am not sure how to narrow down which drive it is:

May 23 17:37:37 unRAID kernel: ata7: status=0x51 { DriveReady SeekComplete Error } (Errors)
May 23 17:37:37 unRAID kernel: ata7: error=0x04 { DriveStatusError } (Errors)
May 23 17:37:37 unRAID kernel: ata7: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)

I thought ata7 related to disk 7 in the array, which is incidentally a 1.5TB Samsung. So I copied its contents to another drive and unassigned the drive as per the instructions on the wiki. The errors continue, and I now think it might be the parity drive (a Samsung 2TB), as I am also getting lots of parity errors.

 

How do I work out which drive is connected to ata7?
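One way to map an ataN port to its /dev/sdX name is to follow the sysfs device links: for libata-managed drives, the resolved path under /sys/block/<dev>/device contains the ataN port name. A minimal sketch (assumes a Linux system with sysfs mounted):

```shell
# For each SCSI disk, resolve its sysfs device symlink and pull out
# the "ataN" component of the path, if any.
for dev in /sys/block/sd?; do
  link=$(readlink -f "$dev/device")
  port=$(echo "$link" | grep -o 'ata[0-9]*' | head -n1)
  echo "$(basename "$dev"): ${port:-no-ata-port}"
done
```

Note that ata port numbers are assigned in controller probe order, so they do not necessarily line up with unRAID disk slot numbers; the sysfs path (or the drive serial from smartctl -i) is the reliable way to match them up.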

