October 28, 2014

I had a failed 2TB drive and was waiting on its replacement. Yesterday afternoon I swapped it for a new 3TB drive, since I have already upgraded my parity and started using 3TB drives now that my chassis is full, and started a rebuild of disk 11. I woke up to everything being unresponsive and this in the syslog. Telnet to the server still works, but that's about it.

md: recovery thread woken up ...
md: recovery thread rebuilding disk11 ...
md: using 1536k window, over a total of 2930266532 blocks.
sd 8:0:3:0: [sdq] command ecb28c00 timed out
sd 8:0:3:0: [sdq] command ecb28300 timed out
sd 8:0:3:0: [sdq] command ecb28840 timed out
sd 8:0:3:0: [sdq] command f76bbe40 timed out
sd 8:0:3:0: [sdq] command f239d9c0 timed out
sas: Enter sas_scsi_recover_host busy: 5 failed: 5
sas: trying to find task 0xf1c8e000
sas: sas_scsi_find_task: aborting task 0xf1c8e000
sas: sas_scsi_find_task: task 0xf1c8e000 is aborted
sas: sas_eh_handle_sas_errors: task 0xf1c8e000 is aborted
sas: trying to find task 0xf1c8e200
sas: sas_scsi_find_task: aborting task 0xf1c8e200
sas: sas_scsi_find_task: task 0xf1c8e200 is aborted
sas: sas_eh_handle_sas_errors: task 0xf1c8e200 is aborted
sas: trying to find task 0xf1c8e900
sas: sas_scsi_find_task: aborting task 0xf1c8e900
sas: sas_scsi_find_task: task 0xf1c8e900 is aborted
sas: sas_eh_handle_sas_errors: task 0xf1c8e900 is aborted
sas: trying to find task 0xf1c3cc00
sas: sas_scsi_find_task: aborting task 0xf1c3cc00
sas: sas_scsi_find_task: task 0xf1c3cc00 is aborted
sas: sas_eh_handle_sas_errors: task 0xf1c3cc00 is aborted
sas: trying to find task 0xf1c8ec00
sas: sas_scsi_find_task: aborting task 0xf1c8ec00
sas: sas_scsi_find_task: task 0xf1c8ec00 is aborted
sas: sas_eh_handle_sas_errors: task 0xf1c8ec00 is aborted
sas: ata18: end_device-8:3: cmd error handler
sas: ata15: end_device-8:0: dev error handler
sas: ata16: end_device-8:1: dev error handler
sas: ata17: end_device-8:2: dev error handler
sas: ata18: end_device-8:3: dev error handler
ata18.00: exception Emask 0x0 SAct 0x3e SErr 0x0 action 0x6 frozen
sas: ata19: end_device-8:4: dev error handler
ata18.00: failed command: WRITE FPDMA QUEUED
sas: ata20: end_device-8:5: dev error handler
ata18.00: cmd 61/00:00:78:99:b2/02:00:a3:00:00/40 tag 1 ncq 262144 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata18.00: status: { DRDY }
ata18.00: failed command: WRITE FPDMA QUEUED
ata18.00: cmd 61/00:00:78:a1:b2/02:00:a3:00:00/40 tag 2 ncq 262144 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata18.00: status: { DRDY }
ata18.00: failed command: WRITE FPDMA QUEUED
ata18.00: cmd 61/00:00:78:9b:b2/02:00:a3:00:00/40 tag 3 ncq 262144 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
sas: ata21: end_device-8:6: dev error handler
ata18.00: status: { DRDY }
ata18.00: failed command: WRITE FPDMA QUEUED
ata18.00: cmd 61/00:00:78:9d:b2/02:00:a3:00:00/40 tag 4 ncq 262144 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata18.00: status: { DRDY }
ata18.00: failed command: WRITE FPDMA QUEUED
ata18.00: cmd 61/00:00:78:9f:b2/02:00:a3:00:00/40 tag 5 ncq 262144 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata18.00: status: { DRDY }
ata18: hard resetting link
drivers/scsi/mvsas/mv_sas.c 1527:mvs_I_T_nexus_reset for device[3]:rc= 0
sas: sas_ata_task_done: SAS error 8a
sas: sas_ata_task_done: SAS error 8a
ata18.00: both IDENTIFYs aborted, assuming NODEV
ata18.00: revalidation failed (errno=-2)
mvsas 0000:02:00.0: Phy4 : No sig fis
sas: sas_form_port: phy4 belongs to port3 already(1)!
ata18: hard resetting link
ata18.00: configured for UDMA/133
ata18.00: device reported invalid CHS sector 0
ata18.00: device reported invalid CHS sector 0
ata18.00: device reported invalid CHS sector 0
ata18.00: device reported invalid CHS sector 0
ata18.00: device reported invalid CHS sector 0
ata18: EH complete
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
md: sync done. time=43583sec
md: recovery thread sync completion status: 0

I'm not sure what my next step should be at this point.
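
For anyone triaging a log like this, a quick tally of which device the errors belong to can help narrow things down. The following is only a rough sketch, assuming the syslog has been saved with one message per line; the function name `summarize` and the two regular expressions are illustrative, not part of unRAID or any kernel tooling.

```python
import re
from collections import Counter

# Hypothetical patterns, written to match the message shapes seen in the log above.
FAILED_CMD = re.compile(r"(ata\d+\.\d+): failed command: (.+)$")
TIMED_OUT = re.compile(r"\[(sd[a-z]+)\] command \S+ timed out")


def summarize(lines):
    """Count failed ATA commands per (port, command) and SCSI timeouts per disk."""
    failed = Counter()
    timeouts = Counter()
    for line in lines:
        line = line.strip()
        match = FAILED_CMD.search(line)
        if match:
            failed[(match.group(1), match.group(2))] += 1
        match = TIMED_OUT.search(line)
        if match:
            timeouts[match.group(1)] += 1
    return failed, timeouts


if __name__ == "__main__":
    # A few lines copied from the log in this post.
    sample = [
        "sd 8:0:3:0: [sdq] command ecb28c00 timed out",
        "sd 8:0:3:0: [sdq] command ecb28300 timed out",
        "ata18.00: failed command: WRITE FPDMA QUEUED",
        "sas: ata19: end_device-8:4: dev error handler",
    ]
    failed, timeouts = summarize(sample)
    print(dict(failed))
    print(dict(timeouts))
```

Run against the full log, a tally like this would show whether the timeouts and failed writes all point at a single device (here, `sdq` on `ata18`) or are spread across several ports on the controller.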