February 2, 2025

Hello, I recently added two 20TB drives to my array to serve as parity, replacing the existing 2 x 14TB parity drives one at a time. There were no problems after both new parity drives were in place. I added one 14TB drive (former parity) to the data array and had no issues for about three days. I then precleared the second, unassigned 14TB former parity drive and added it to the data array within the past 24 hours.

While trying to update some missing media, I noticed that some of my media folders were missing from the shares. I checked drive health and all drives are fine, but eight are showing read errors. When I check the disk logs on one of the drives with errors I see the following:

I see the same "Synchronize Cache(10) failed" error on all eight drives with errors. Only this one drive shows I/O errors.

I thought I might have exceeded the 30-drive limit, but I believe I should be fine:

2 x parity
24 x array
2 x cache pool
1 x USB
= 29 drives

I have a main case and a DAS. I don't believe this is a hardware issue related to that. I haven't stopped the array yet, but I did stop the running Docker containers.

Can someone please help me review? Attaching diagnostics logs. Thanks

tower-diagnostics-20250202-1437.zip
February 2, 2025 Community Expert

31 minutes ago, Dradder1 said: "eight are showing read errors"

What do those have in common?
February 2, 2025 Author

I was looking at the diagnostics and see the following entries:

Feb 2 13:43:33 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
### [PREVIOUS LINE REPEATED 5 TIMES] ###
Feb 2 13:43:38 Tower kernel: mpt2sas_cm1: _base_fault_reset_work: Running mpt3sas_dead_ioc thread success !!!!
Feb 2 13:43:38 Tower kernel: sd 18:0:0:0: [sdv] Synchronizing SCSI cache
Feb 2 13:43:38 Tower kernel: sd 18:0:0:0: [sdv] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:38 Tower kernel: sd 18:0:1:0: [sdw] Synchronizing SCSI cache
Feb 2 13:43:38 Tower kernel: sd 18:0:1:0: [sdw] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:38 Tower kernel: sd 18:0:2:0: [sdx] Synchronizing SCSI cache
Feb 2 13:43:38 Tower kernel: sd 18:0:2:0: [sdx] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:38 Tower kernel: sd 18:0:3:0: [sdy] Synchronizing SCSI cache
Feb 2 13:43:38 Tower kernel: sd 18:0:3:0: [sdy] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:38 Tower kernel: sd 18:0:4:0: [sdz] Synchronizing SCSI cache
Feb 2 13:43:38 Tower kernel: sd 18:0:4:0: [sdz] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:38 Tower kernel: sd 18:0:5:0: [sdaa] Synchronizing SCSI cache
Feb 2 13:43:38 Tower kernel: sd 18:0:5:0: [sdaa] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:39 Tower kernel: sd 18:0:6:0: [sdab] Synchronizing SCSI cache
Feb 2 13:43:39 Tower kernel: sd 18:0:6:0: [sdab] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Feb 2 13:43:39 Tower kernel: sd 18:0:7:0: [sdac] Synchronizing SCSI cache
Feb 2 13:43:39 Tower kernel: sd 18:0:7:0: [sdac] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK

I think these drives are connected to the same SAS card.
I found some forum posts dealing with the same diagnostics log entries. I'll power down and try to re-seat the card. I don't believe I have a spare PCIe slot to move it to, but I do have a much newer board/CPU combo that I meant to swap in to replace the old hardware running this system. I'll try those steps out.
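
For anyone who lands here with the same symptom: the "what do those have in common?" question can be answered straight from the syslog. In the kernel lines quoted above, the address after "sd" is host:channel:target:lun, and all eight failing devices share host 18, which is what points at a single controller. Below is a minimal Python sketch of that grouping. The function name and the sample list are made up for illustration, and it assumes the log lines look exactly like the ones quoted in this thread.

```python
import re
from collections import defaultdict

# Matches kernel lines of the form quoted in this thread, e.g.
#   sd 18:0:3:0: [sdy] Synchronize Cache(10) failed: Result: ...
# The four numbers are the SCSI address host:channel:target:lun.
FAILED_SYNC = re.compile(
    r"sd (?P<host>\d+):(?P<channel>\d+):(?P<target>\d+):(?P<lun>\d+): "
    r"\[(?P<dev>sd[a-z]+)\] Synchronize Cache\(10\) failed"
)


def group_failures_by_host(lines):
    """Return {scsi_host_number: [device names]} for failed cache syncs."""
    by_host = defaultdict(list)
    for line in lines:
        match = FAILED_SYNC.search(line)
        if match:
            by_host[int(match.group("host"))].append(match.group("dev"))
    return dict(by_host)


# Sample input: three lines copied from the log excerpt in this thread.
sample = [
    "Feb 2 13:43:38 Tower kernel: sd 18:0:0:0: [sdv] Synchronizing SCSI cache",
    "Feb 2 13:43:38 Tower kernel: sd 18:0:0:0: [sdv] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK",
    "Feb 2 13:43:38 Tower kernel: sd 18:0:5:0: [sdaa] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK",
]

result = group_failures_by_host(sample)
for host, devices in sorted(result.items()):
    print(f"SCSI host {host}: {', '.join(devices)}")
```

Feeding it every line of the syslog from the diagnostics zip instead of the sample would list all affected devices per host. If every failure lands on one host number, as it does here, the controller behind that host (or its cabling or slot) is a more likely suspect than eight drives failing at once.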
February 3, 2025 Author Solution

I re-seated the PCIe card, and the device now appears under System Devices, whereas before the shutdown it did not. The drives look fine, and I can see the media shares that were missing before I rebooted and re-seated both the PCIe card and the SAS-to-SATA cables. I hope this is a one-off, but I will continue to monitor.

Is it worth running a parity check on my array? I may have had some media added to the array before I noticed the errors. There were no errors in the last parity check a month ago.