April 12, 2023
Recently I have been having issues with my cache disks. All of my Docker services stop, and then I see a large number of btrfs errors in the log for one of the disks. For the disk to be detected again, I have to shut down the server, open the lid, and disconnect and reconnect the SATA cables. It is then fine for a random amount of time, usually at least a day, before it happens again. I have attached my diagnostics file.

I am running a Dell R720 with all 8 bays filled with 20TB Exos drives. My cache disks are connected to the onboard SATA ports (they are only 3 Gbps). I tried using an extra Mini-SAS port, but for some reason it caused issues with my disk in slot 0, and I am wondering if this is the same issue. It appears that the SSD is losing power and therefore losing connectivity, but I am not skilled enough at reading the diagnostics reports to tell. Thank you!

diagnostics-20230412-1516.zip
April 13, 2023, Community Expert, Solution
Apr 12 09:09:34 Hamm-NAS kernel: ata5: hard resetting link
Apr 12 09:09:39 Hamm-NAS kernel: ata5: COMRESET failed (errno=-16)
Apr 12 09:09:39 Hamm-NAS kernel: ata5: reset failed, giving up
Apr 12 09:09:39 Hamm-NAS kernel: ata5.00: disable device
Apr 12 09:09:39 Hamm-NAS kernel: ata5: EH complete

Cache1 dropped offline; this is usually a power/connection problem.
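For anyone who wants to look for the same pattern in their own syslog, here is a rough Python sketch that picks out the ATA ports whose link reset failed. It is only an illustration: the function name `find_dropped_ports` and the list of marker strings are my own choices, taken from the messages in the excerpt above, and other kernels or controllers may word things differently.

```python
import re
from collections import defaultdict

# Kernel libata messages in the excerpt look like
# "kernel: ata5: COMRESET failed (errno=-16)" or "kernel: ata5.00: disable device".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")

# Assumption: these substrings, copied from the excerpt, are treated as
# signs that a link dropped. The list is illustrative, not exhaustive.
FAILURE_MARKERS = ("COMRESET failed", "reset failed, giving up", "disable device")


def find_dropped_ports(lines):
    """Return {port: [failure messages]} for ports that logged link failures."""
    failures = defaultdict(list)
    for line in lines:
        match = ATA_LINE.search(line.rstrip())
        if match is None:
            continue  # not an ATA kernel message
        port, message = match.groups()
        if any(marker in message for marker in FAILURE_MARKERS):
            failures[port].append(message)
    return dict(failures)


# The excerpt quoted in this post, used as sample input.
SAMPLE = """\
Apr 12 09:09:34 Hamm-NAS kernel: ata5: hard resetting link
Apr 12 09:09:39 Hamm-NAS kernel: ata5: COMRESET failed (errno=-16)
Apr 12 09:09:39 Hamm-NAS kernel: ata5: reset failed, giving up
Apr 12 09:09:39 Hamm-NAS kernel: ata5.00: disable device
Apr 12 09:09:39 Hamm-NAS kernel: ata5: EH complete
"""

if __name__ == "__main__":
    for port, messages in find_dropped_ports(SAMPLE.splitlines()).items():
        print(port, messages)
```

To run it against a real log, pass the lines of the syslog file from your diagnostics instead of `SAMPLE`. A port that shows up here still needs to be matched to a physical disk by hand.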
May 22, 2023, Author
This solved it. I had a SATA power splitter on those SSDs. The splitter came in a 2-pack, so I swapped to the second one and haven't had issues since. Thanks!