caplam (Author), Posted October 24, 2020
So I put in a third disk. This time it's an IronWolf (precleared with 3 passes) that already has 20k hours on it. The rebuild has started.
caplam (Author), Posted October 24, 2020
I continue to see SAS errors in the log:
Oct 24 19:37:26 godzilla kernel: sas: Enter sas_scsi_recover_host busy: 9 failed: 9
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x00000000b6e063ac
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x00000000b6e063ac
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 00000000b6e063ac, old_request == 00000000faf8b36b
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 00000000b6e063ac , old_request == 00000000faf8b36b
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x00000000b6e063ac is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x00000000b6e063ac is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x000000001fb3ea36
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x000000001fb3ea36
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 000000001fb3ea36, old_request == 00000000f191f1ed
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 000000001fb3ea36 , old_request == 00000000f191f1ed
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x000000001fb3ea36 is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x000000001fb3ea36 is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x0000000029adb82b
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x0000000029adb82b
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 0000000029adb82b, old_request == 0000000057d81343
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 0000000029adb82b , old_request == 0000000057d81343
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x0000000029adb82b is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x0000000029adb82b is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x000000001f95191f
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x000000001f95191f
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 000000001f95191f, old_request == 00000000a10e1ea9
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 000000001f95191f , old_request == 00000000a10e1ea9
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x000000001f95191f is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x000000001f95191f is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x000000005c2436ca
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x000000005c2436ca
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 000000005c2436ca, old_request == 000000009da361b3
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 000000005c2436ca , old_request == 000000009da361b3
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x000000005c2436ca is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x000000005c2436ca is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x0000000023c5b3a7
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x0000000023c5b3a7
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 0000000023c5b3a7, old_request == 0000000047e60ca1
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 0000000023c5b3a7 , old_request == 0000000047e60ca1
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x0000000023c5b3a7 is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x0000000023c5b3a7 is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x000000006facca72
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x000000006facca72
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 000000006facca72, old_request == 0000000024adc4d9
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 000000006facca72 , old_request == 0000000024adc4d9
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x000000006facca72 is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x000000006facca72 is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x00000000520d3197
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x00000000520d3197
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 00000000520d3197, old_request == 00000000d6f7823f
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 00000000520d3197 , old_request == 00000000d6f7823f
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x00000000520d3197 is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x00000000520d3197 is done
Oct 24 19:37:26 godzilla kernel: sas: trying to find task 0x0000000039adefa1
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: aborting task 0x0000000039adefa1
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: dev = 00000000ae73c6a0 (STP/SATA), task = 0000000039adefa1, old_request == 000000001462f4c3
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: SATA/STP request or complete_in_target (1), or IDEV_GONE (0), thus no TMF
Oct 24 19:37:26 godzilla kernel: isci 0000:02:00.0: isci_task_abort_task: Done; dev = 00000000ae73c6a0, task = 0000000039adefa1 , old_request == 000000001462f4c3
Oct 24 19:37:26 godzilla kernel: sas: sas_scsi_find_task: task 0x0000000039adefa1 is done
Oct 24 19:37:26 godzilla kernel: sas: sas_eh_handle_sas_errors: task 0x0000000039adefa1 is done
Oct 24 19:37:26 godzilla kernel: sas: ata10: end_device-9:0: cmd error handler
Oct 24 19:37:26 godzilla kernel: sas: ata10: end_device-9:0: dev error handler
Oct 24 19:37:26 godzilla kernel: ata10.00: exception Emask 0x0 SAct 0x3fe00 SErr 0x0 action 0x6 frozen
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/98:00:38:22:0b/02:00:00:00:00/40 tag 9 ncq dma 339968 in
Oct 24 19:37:26 godzilla kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/98:00:d0:24:0b/01:00:00:00:00/40 tag 10 ncq dma 208896 in
Oct 24 19:37:26 godzilla kernel: res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: sas: ata14: end_device-9:1: dev error handler
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/38:00:68:26:0b/02:00:00:00:00/40 tag 11 ncq dma 290816 in
Oct 24 19:37:26 godzilla kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: sas: ata15: end_device-9:2: dev error handler
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/08:00:f8:87:c1/00:00:2a:00:00/40 tag 12 ncq dma 4096 in
Oct 24 19:37:26 godzilla kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/28:00:a0:28:0b/01:00:00:00:00/40 tag 13 ncq dma 151552 in
Oct 24 19:37:26 godzilla kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/70:00:c8:29:0b/00:00:00:00:00/40 tag 14 ncq dma 57344 in
Oct 24 19:37:26 godzilla kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/20:00:e8:04:62/00:00:5d:01:00/40 tag 15 ncq dma 16384 in
Oct 24 19:37:26 godzilla kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/58:00:38:2a:0b/02:00:00:00:00/40 tag 16 ncq dma 307200 in
Oct 24 19:37:26 godzilla kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/50:00:90:2c:0b/00:00:00:00:00/40 tag 17 ncq dma 40960 in
Oct 24 19:37:26 godzilla kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: status: { DRDY }
Oct 24 19:37:26 godzilla kernel: ata10: hard resetting link
Oct 24 19:37:26 godzilla kernel: ata10.00: configured for UDMA/133
Oct 24 19:37:26 godzilla kernel: ata10: EH complete
Oct 24 19:37:26 godzilla kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 9 tries: 1
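For anyone reading a log like the one above, here is a minimal sketch (not something posted in the thread) of how the failures could be tallied per device. It assumes the syslog layout shown in the post and that `SAct` is the bitmask of NCQ tags that were outstanding when the error handler ran. The names `summarize` and `sact_tags` are made up for illustration.

```python
import re
from collections import Counter

# A few lines copied from the log above, shortened for the example.
SAMPLE = """\
Oct 24 19:37:26 godzilla kernel: ata10.00: exception Emask 0x0 SAct 0x3fe00 SErr 0x0 action 0x6 frozen
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/98:00:38:22:0b/02:00:00:00:00/40 tag 9 ncq dma 339968 in
Oct 24 19:37:26 godzilla kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 24 19:37:26 godzilla kernel: ata10.00: failed command: READ FPDMA QUEUED
Oct 24 19:37:26 godzilla kernel: ata10.00: cmd 60/98:00:d0:24:0b/01:00:00:00:00/40 tag 10 ncq dma 208896 in
Oct 24 19:37:26 godzilla kernel: res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
"""

# Patterns are assumptions based on the message layout seen in this thread.
FAILED_RE = re.compile(r"kernel: (ata\d+\.\d+): failed command: (.+)$")
SACT_RE = re.compile(r"kernel: (ata\d+\.\d+): exception .*SAct (0x[0-9a-fA-F]+)")


def sact_tags(mask):
    """Return the bit positions set in an SAct mask (one bit per NCQ tag)."""
    return [bit for bit in range(32) if (mask >> bit) & 1]


def summarize(log_text):
    """Count failed commands per (device, command) and decode SAct per device."""
    failed = Counter()
    outstanding = {}
    for line in log_text.splitlines():
        m = FAILED_RE.search(line)
        if m:
            failed[(m.group(1), m.group(2).strip())] += 1
            continue
        m = SACT_RE.search(line)
        if m:
            outstanding[m.group(1)] = sact_tags(int(m.group(2), 16))
    return failed, outstanding


if __name__ == "__main__":
    failed, outstanding = summarize(SAMPLE)
    for (device, command), count in failed.items():
        print(device, command, count)
    for device, tags in outstanding.items():
        print(device, "outstanding tags:", tags)
```

Decoding `0x3fe00` this way gives tags 9 through 17, which lines up with the nine `failed command` entries and the `failed: 9` count in the log. Everything points at a single device, `ata10.00`.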
JorgeB, Posted October 25, 2020
Full diags please.
caplam (Author), Posted October 25, 2020
For now I don't have access to my server. The rebuild started, but I still had errors on the SAS controllers, so I rebooted and a new rebuild started. Now it's finished and the array is OK. Disk2 is enabled and I have no more SAS errors. As I seem to be very unlucky this weekend, the replacement disk is throwing some read error rate warnings: I get messages that point to the raw read error rate, and a few minutes or hours later it's back to normal. I am also running a 2-pass preclear on the first replacement disk2.
caplam (Author), Posted October 25, 2020
Here are the diags. You only see errors on the former parity 2 disk.
godzilla-diagnostics-20201025-1517.zip
JorgeB, Posted October 26, 2020
19 hours ago, caplam said: "You only see errors on the former parity 2 disk."
Yep, the rest looks fine for now.
caplam (Author), Posted October 26, 2020
I thought I was fine, but I'm definitely a "black cat" (a French expression for someone very unlucky). Last night the smbd process crashed during a VM backup. This morning some Docker containers and some VMs were unresponsive. I had to kill smbd to be able to stop the array and reboot, which took me an hour. I hope this is a one-time bug (I have never had this before).
trurl, Posted October 26, 2020
Did you try to get diagnostics before rebooting?
caplam (Author), Posted October 26, 2020
Yes, over SSH, as it was impossible via the GUI. After the reboot everything is running perfectly well. I received a new HDD for parity 2; it is currently preclearing.
godzilla-diagnostics-20201026-0941.zip