Johnm Posted May 15, 2011

I am sure this is normal, but I wanted to double-check. I'm running 5.0 b6a.

My syslog is getting spammed with hundreds of lines of disk I/O and SAS disconnect errors. I am on the preclear post-read of /dev/sdi. I am assuming the drive is failing on the reads, so the kernel is dropping and re-enabling the SAS port to try to correct the read error? While this is happening, other drives on the same SAS channel/backplane are functioning fine. I am not worried, but I wanted to double-check that I should not panic. It could be something deeper.

I am pretty sure that drive is toast. It had a Post-it on it saying to check SMART. I checked SMART and it had 6 blocks pending reallocation. I went ahead and let it fix that, then ran the Samsung tools' extended tests with zeroing, and it passed as A-OK! (LIARS!) I popped it into unRAID, and that's what I am seeing now. I'm sure it will now fail SMART again. I now have 3 F2s here to RMA; this will most likely make #4. (I loathe Samsung support.)

May 15 12:20:52 Goliath kernel: sas: sas_ata_task_done: SAS error 8d
May 15 12:20:52 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:20:52 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:20:52 Goliath kernel: ata8: error=0x04 { DriveStatusError }
May 15 12:20:52 Goliath kernel: sas: --- Exit sas_scsi_recover_host
May 15 12:20:52 Goliath kernel: sas: sas_to_ata_err: Saw error 2. What to do?
May 15 12:20:52 Goliath kernel: sas: sas_ata_task_done: SAS error 2
May 15 12:20:52 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:20:52 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:20:52 Goliath kernel: ata8: error=0x04 { DriveStatusError }
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] Result: hostbyte=0x00 driverbyte=0x08
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] Sense Key : 0xb [current] [descriptor]
May 15 12:20:52 Goliath kernel: Descriptor sense data with sense descriptors (in hex):
May 15 12:20:52 Goliath kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
May 15 12:20:52 Goliath kernel: 00 00 00 2f
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] ASC=0x0 ASCQ=0x0
May 15 12:20:52 Goliath kernel: sd 0:0:1:0: [sdi] CDB: cdb[0]=0x28: 28 00 59 b6 82 38 00 00 08 00
May 15 12:20:52 Goliath kernel: end_request: I/O error, dev sdi, sector 1505133112
May 15 12:20:52 Goliath kernel: Buffer I/O error on device sdi, logical block 188141639
May 15 12:21:23 Goliath kernel: sas: command 0xda2c4b40, task 0xd7dec640, timed out: BLK_EH_NOT_HANDLED
May 15 12:21:23 Goliath kernel: sas: Enter sas_scsi_recover_host
May 15 12:21:23 Goliath kernel: sas: trying to find task 0xd7dec640
May 15 12:21:23 Goliath kernel: sas: sas_scsi_find_task: aborting task 0xd7dec640
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=c51c0000 task=d7dec640 slot=c51d15d4 slot_idx=x0
May 15 12:21:23 Goliath kernel: sas: sas_scsi_find_task: querying task 0xd7dec640
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
May 15 12:21:23 Goliath kernel: sas: sas_scsi_find_task: task 0xd7dec640 failed to abort
May 15 12:21:23 Goliath kernel: sas: task 0xd7dec640 is not at LU: I_T recover
May 15 12:21:23 Goliath kernel: sas: I_T nexus reset for dev 0000000000000000
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x89800.
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1001
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy0 Unplug Notice
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1081
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x10000
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[0]
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 0 attach dev info is 0
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 0 attach sas addr is 0
May 15 12:21:23 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 0 byte dmaded.
May 15 12:21:23 Goliath kernel: sas: sas_form_port: phy0 belongs to port1 already(1)!
May 15 12:21:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[1]:rc= 0
May 15 12:21:25 Goliath kernel: sas: I_T 0000000000000000 recovered
May 15 12:21:25 Goliath kernel: sas: sas_ata_task_done: SAS error 8d
May 15 12:21:25 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:21:25 Goliath kernel: ata8.00: device reported invalid CHS sector 0
May 15 12:21:25 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:21:25 Goliath kernel: ata8: error=0x04 { DriveStatusError }
May 15 12:21:25 Goliath kernel: sas: --- Exit sas_scsi_recover_host
May 15 12:21:25 Goliath kernel: sas: sas_to_ata_err: Saw error 2. What to do?
May 15 12:21:25 Goliath kernel: sas: sas_ata_task_done: SAS error 2
May 15 12:21:25 Goliath kernel: ata8: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 15 12:21:25 Goliath kernel: ata8.00: device reported invalid CHS sector 0
May 15 12:21:25 Goliath kernel: ata8: status=0x01 { Error }
May 15 12:21:25 Goliath kernel: ata8: error=0x04 { DriveStatusError }
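The CDB line in the log above can be checked against the reported failing sector. This is a minimal sketch, not anything the kernel or unRAID ships: the function name `decode_read10` is made up for illustration, and it assumes the standard 10-byte READ(10) layout (opcode 0x28, a big-endian 4-byte LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8). The division by 8 assumes 512-byte sectors and 4 KiB buffer blocks, which is what the numbers in the log appear consistent with.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # Bytes 2-5 hold the starting logical block address, big-endian.
    lba = int.from_bytes(cdb[2:6], "big")
    # Bytes 7-8 hold the number of blocks requested, big-endian.
    blocks = int.from_bytes(cdb[7:9], "big")
    return {"lba": lba, "blocks": blocks}


# CDB copied from the "[sdi] CDB:" line in the syslog excerpt.
info = decode_read10("28 00 59 b6 82 38 00 00 08 00")
print(info["lba"])       # 1505133112, the sector in the end_request line
print(info["blocks"])    # 8 sectors requested
# Assumption: 8 x 512-byte sectors per 4 KiB buffer block.
print(info["lba"] // 8)  # 188141639, the logical block in the Buffer I/O line
```

So the read that failed, the I/O error sector, and the buffer logical block all point at the same spot on sdi, which fits a drive-side read failure rather than unrelated errors.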
teamhood Posted May 17, 2011

I too am seeing this error and trying to figure it out....
Johnm Posted May 17, 2011 (Author)

Mine was a bad drive. On the post-read from the preclear, it took 19 hours to get from 48% to 51%. I then pulled the drive and the errors went away. The drive is toast: hundreds of bad blocks.

The ironic part: this drive was the last 1.5 I had that I thought worked, kept as a spare for a RAID array. While it was in preclear, the RAID dropped a drive, the first time one has dropped in that server in about 2 years. I now have 5 of this model to RMA.
teamhood Posted May 17, 2011

Interesting, so this is with the Samsung 1.5TBs, eh? I do have 2 or 3 of those...
Johnm Posted May 17, 2011 (Author)

I have about 18 of them that I got when they first came out. They all went into RAID arrays. As I have been replacing them with larger drives, they have been going into other things (like unRAID). They have been pretty stable so far. I had one DOA that I never RMA'd, then had 4 go bad in the last 3 months.

All of the 1.5s I took out of service were replaced by F4s; so far not one has gone bad. I have about 16 of those now. Too bad they got bought out.

In my case, I am pretty sure it was the drive. I think this one came from my WHS box and had SMART errors already.
cylon Posted May 23, 2011

I am getting the same error but am not sure how to narrow down which drive it is:

May 23 17:37:37 unRAID kernel: ata7: status=0x51 { DriveReady SeekComplete Error } (Errors)
May 23 17:37:37 unRAID kernel: ata7: error=0x04 { DriveStatusError } (Errors)
May 23 17:37:37 unRAID kernel: ata7: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related)

I thought ata7 related to disk 7 in the array, which is incidentally a 1.5TB Samsung. So I copied its contents to another drive and unassigned the drive as per the instructions on the wiki. The errors continue, and I think it might be the parity drive (Samsung 2TB), as I am also getting lots of parity errors. How do I work out which drive is connected to ata7?
Johnm Posted May 23, 2011 (Author)

If I'm not mistaken, the start of the syslog tells you which drive is ataX.

The error is not specific to Samsung. It looks like it is just a SMART error in general.
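One way to pull that mapping out of a syslog is sketched below. It rests on an assumption that is not shown anywhere in this thread: that the boot-time libata identify lines have the usual shape `ataN.NN: ATA-x: <model>, <firmware>, max <mode>`. The function name `map_ata_ports`, the regular expression, and the sample lines (including the model and firmware strings) are all hypothetical and only there to illustrate the idea; check them against your own log before trusting the result.

```python
import re

# Assumed shape of the boot-time identify line (not taken from this thread):
#   "... kernel: ata7.00: ATA-8: <model>, <firmware>, max UDMA/133"
IDENT = re.compile(r"kernel: (ata\d+)\.\d+: ATA-\d+: ([^,]+), ([^,]+), max")


def map_ata_ports(lines):
    """Return {port: (model, firmware)} from identify lines in a syslog."""
    ports = {}
    for line in lines:
        match = IDENT.search(line)
        if match:
            port, model, firmware = match.groups()
            # Keep the first identify line seen for each port.
            ports.setdefault(port, (model.strip(), firmware.strip()))
    return ports


# Hypothetical sample: the first line is invented to show the assumed format,
# the second is an error line of the kind posted above and is ignored.
sample = [
    "May 23 09:00:01 unRAID kernel: ata7.00: ATA-8: EXAMPLE MODEL 1500, FW0001, max UDMA/133",
    "May 23 17:37:37 unRAID kernel: ata7: error=0x04 { DriveStatusError }",
]
print(map_ata_ports(sample))
```

To run it against a real log, pass the open file instead of `sample`, for example `map_ata_ports(open(path))` with `path` set to wherever your syslog lives. The model string alone may not be enough if several identical drives are installed; in that case the port still has to be matched to a physical disk some other way.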