February 4, 2017
Since I have had several disk errors lately, I am keeping an eye on the syslog. I now see the following:
Jan 29 21:52:32 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Jan 29 21:52:32 Tower kernel: sas: ata11: end_device-1:4: cmd error handler
Jan 29 21:52:32 Tower kernel: sas: ata7: end_device-1:0: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata8: end_device-1:1: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata9: end_device-1:2: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata10: end_device-1:3: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata11: end_device-1:4: dev error handler
Jan 29 21:52:32 Tower kernel: ata11.00: request sense failed stat 50 emask 0
Jan 29 21:52:32 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
Jan 29 21:52:32 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Jan 29 21:52:32 Tower kernel: sas: ata11: end_device-1:4: cmd error handler
Jan 29 21:52:32 Tower kernel: sas: ata7: end_device-1:0: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata8: end_device-1:1: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata9: end_device-1:2: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata10: end_device-1:3: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata11: end_device-1:4: dev error handler
Jan 29 21:52:32 Tower kernel: ata11.00: request sense failed stat 50 emask 0
Jan 29 21:52:32 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
Jan 29 21:52:32 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Jan 29 21:52:32 Tower kernel: sas: ata7: end_device-1:0: cmd error handler
Jan 29 21:52:32 Tower kernel: sas: ata7: end_device-1:0: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata8: end_device-1:1: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata9: end_device-1:2: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata10: end_device-1:3: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata11: end_device-1:4: dev error handler
Jan 29 21:52:32 Tower kernel: ata7.00: request sense failed stat 50 emask 0
Jan 29 21:52:32 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
Jan 29 21:52:32 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Jan 29 21:52:32 Tower kernel: sas: ata7: end_device-1:0: cmd error handler
Jan 29 21:52:32 Tower kernel: sas: ata7: end_device-1:0: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata8: end_device-1:1: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata9: end_device-1:2: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata10: end_device-1:3: dev error handler
Jan 29 21:52:32 Tower kernel: sas: ata11: end_device-1:4: dev error handler
Jan 29 21:52:32 Tower kernel: ata7.00: request sense failed stat 50 emask 0
Does anyone have any idea?
February 4, 2017 Community Expert
This is similar to the error that is logged when the SASLP or SAS2LP has issues and drops one or more disks. In this case it recovered before more harm was done. A typical, more serious error looks like this:
Jan 7 10:54:09 Tower kernel: sas: Enter sas_scsi_recover_host busy: 32 failed: 32
Jan 7 10:54:09 Tower kernel: sas: trying to find task 0xffff880753351600
Jan 7 10:54:09 Tower kernel: sas: sas_scsi_find_task: aborting task 0xffff880753351600
Jan 7 10:54:09 Tower kernel: sas: sas_scsi_find_task: task 0xffff880753351600 is aborted
Jan 7 10:54:09 Tower kernel: sas: sas_eh_handle_sas_errors: task 0xffff880753351600 is aborted
When there's a problem, the host busy number is usually > 1 (but it can also be 1) and there's an aborted task immediately below. After that, disk(s) are dropped and sometimes the controller crashes.
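If you want to check a saved syslog for that pattern without reading it by eye, something like the rough Python sketch below could work. It is not an official tool; the function name `classify_recoveries` and both regular expressions are my own assumptions, built only from the log lines quoted in this thread, so other kernel versions may word things differently. It groups lines by each "Enter sas_scsi_recover_host" entry and notes whether an "is aborted" line follows before the next entry.

```python
import re

# Patterns are assumptions derived from the log lines quoted in this thread.
ENTER_RE = re.compile(r"sas: Enter sas_scsi_recover_host busy: (\d+) failed: (\d+)")
ABORT_RE = re.compile(r"sas: .*task 0x[0-9a-f]+ is aborted")


def classify_recoveries(lines):
    """Return one (busy, failed, aborted) tuple per recovery event.

    'aborted' is True when an "is aborted" line appears after the
    Enter line and before the next Enter line.
    """
    events = []
    current = None
    for line in lines:
        match = ENTER_RE.search(line)
        if match:
            # A new recovery event starts; store the previous one first.
            if current is not None:
                events.append(tuple(current))
            current = [int(match.group(1)), int(match.group(2)), False]
        elif current is not None and ABORT_RE.search(line):
            current[2] = True
    if current is not None:
        events.append(tuple(current))
    return events


if __name__ == "__main__":
    import sys

    # Usage: python check_sas.py /path/to/saved/syslog
    with open(sys.argv[1], errors="replace") as handle:
        for busy, failed, aborted in classify_recoveries(handle):
            label = "aborted task follows" if aborted else "recovered"
            print(f"busy={busy} failed={failed}: {label}")
```

Events reported with an aborted task are the ones worth worrying about; events that only show "recovered" match the milder case from the first post.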
February 4, 2017 Author
That is what I suspected. I am going to change these cards out; I do not want the hassle. Any advice?