gamer1pc Posted February 2, 2021
My parity disk became disabled and I started a parity check, but stopped it once I realized there was no point to it. I have a feeling it might be the SATA cables or the SATA controller on the motherboard, as the same thing happened to disk1 a day earlier, and Unraid hung on "stopping array" for hours when I wanted to shut down to inspect disk1. That forced me to do a hard shutdown of the server. I saved the diagnostics file before restarting and have attached it. I'd really appreciate any help figuring this out.
Attachment: xio-diagnostics-20210201-1238.zip
JorgeB Posted February 2, 2021
Errors in multiple disks:

Feb 1 10:24:47 XIO kernel: ata5: softreset failed (1st FIS failed)
Feb 1 10:24:47 XIO kernel: ata5: limiting SATA link speed to 3.0 Gbps
Feb 1 10:24:47 XIO kernel: ata5: hard resetting link
Feb 1 10:24:47 XIO kernel: ata6: softreset failed (1st FIS failed)
Feb 1 10:24:47 XIO kernel: ata6: limiting SATA link speed to 3.0 Gbps
Feb 1 10:24:47 XIO kernel: ata6: hard resetting link
Feb 1 10:24:52 XIO kernel: ata6: softreset failed (1st FIS failed)
Feb 1 10:24:52 XIO kernel: ata6: reset failed, giving up
Feb 1 10:24:52 XIO kernel: ata6.00: disabled
Feb 1 10:24:52 XIO kernel: ata6: EH complete
Feb 1 10:25:17 XIO kernel: ata1: softreset failed (1st FIS failed)
Feb 1 10:25:17 XIO kernel: ata1: hard resetting link
Feb 1 10:25:17 XIO kernel: ata3: softreset failed (1st FIS failed)
Feb 1 10:25:17 XIO kernel: ata3: hard resetting link
Feb 1 10:25:17 XIO kernel: ata4: softreset failed (1st FIS failed)
Feb 1 10:25:17 XIO kernel: ata4: hard resetting link

This could be the typical AMD controller issue or a power problem. There are also some IOMMU errors related to the GPU. If it's the controller issue, upgrading to v6.9 might help.
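For readers who want to check their own syslog for the same pattern, the excerpt above can be tallied per ATA port with a short script. This is only a rough sketch: the function name `summarize_ata_events` and the result layout are made up for illustration, the sample text is copied from the post above, and the matching assumes the kernel messages are worded exactly as in that excerpt.

```python
import re
from collections import defaultdict

# Sample lines copied from the syslog excerpt quoted in the post above.
SAMPLE_LOG = """\
Feb 1 10:24:47 XIO kernel: ata5: softreset failed (1st FIS failed)
Feb 1 10:24:47 XIO kernel: ata5: limiting SATA link speed to 3.0 Gbps
Feb 1 10:24:47 XIO kernel: ata5: hard resetting link
Feb 1 10:24:47 XIO kernel: ata6: softreset failed (1st FIS failed)
Feb 1 10:24:47 XIO kernel: ata6: limiting SATA link speed to 3.0 Gbps
Feb 1 10:24:47 XIO kernel: ata6: hard resetting link
Feb 1 10:24:52 XIO kernel: ata6: softreset failed (1st FIS failed)
Feb 1 10:24:52 XIO kernel: ata6: reset failed, giving up
Feb 1 10:24:52 XIO kernel: ata6.00: disabled
Feb 1 10:24:52 XIO kernel: ata6: EH complete
Feb 1 10:25:17 XIO kernel: ata1: softreset failed (1st FIS failed)
Feb 1 10:25:17 XIO kernel: ata1: hard resetting link
Feb 1 10:25:17 XIO kernel: ata3: softreset failed (1st FIS failed)
Feb 1 10:25:17 XIO kernel: ata3: hard resetting link
Feb 1 10:25:17 XIO kernel: ata4: softreset failed (1st FIS failed)
Feb 1 10:25:17 XIO kernel: ata4: hard resetting link
"""

# Matches "kernel: ataN:" or "kernel: ataN.MM:" and captures the port and message.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")


def summarize_ata_events(log_text):
    """Tally soft-reset failures, hard resets, and disabled state per ATA port.

    Hypothetical helper for illustration; not part of Unraid or its diagnostics.
    """
    summary = defaultdict(
        lambda: {"softreset_failed": 0, "hard_reset": 0, "disabled": False}
    )
    for line in log_text.splitlines():
        match = ATA_LINE.search(line)
        if not match:
            continue  # not an ATA kernel message
        port, message = match.groups()
        entry = summary[port]
        if "softreset failed" in message:
            entry["softreset_failed"] += 1
        elif "hard resetting link" in message:
            entry["hard_reset"] += 1
        elif message.strip() == "disabled" or "giving up" in message:
            entry["disabled"] = True
    return dict(summary)


if __name__ == "__main__":
    for port, counts in sorted(summarize_ata_events(SAMPLE_LOG).items()):
        print(port, counts)
```

Seeing resets on several ports within the same minute, as here, is the kind of pattern that points toward something shared (controller or power) rather than one bad disk or cable, which is consistent with the reading given in the post.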
gamer1pc Posted February 3, 2021
I'm going to update to 6.9 to see if it resolves these problems, then. I just need to wait for the parity data rebuild to finish. Thanks for the advice.
gamer1pc Posted February 3, 2021
Hi, I upgraded the OS to 6.9rc2 after the data rebuild, then rebooted and started the array. Now the cache disks are showing as "Unmountable: No file system". I have attached the diagnostics and a picture. Should I reboot again and see if they get mounted this time?
Attachment: xio-diagnostics-20210203-0716.zip
JorgeB Posted February 3, 2021
The cache filesystem is corrupt, likely as a result of the previous issues. It's best to re-format and restore the data from a backup. If there's important data and no backups, there are some recovery options here.
gamer1pc Posted February 3, 2021
Yeah, the Cache 2 disk was a mirror of the Cache disk, so I just swapped them. Thank you once again for explaining what happened.
gamer1pc Posted February 3, 2021
Well, just when I thought I was fine, I had to stop the array, and the other cache disk became corrupted as well when I started it up again. I have a question regarding the recovery options you linked: do these have to be performed with the array started in normal mode, in maintenance mode, or with the array not started at all? Thanks in advance.
JorgeB Posted February 3, 2021
If the cache is unmountable, the array can be started in normal mode.
gamer1pc Posted February 5, 2021
I was able to restore everything back to normal on Wednesday, and everything was fine on Thursday, but this Friday afternoon parity got disabled and other disks were put into grey status, as the picture shows. I couldn't stop the array since it was stuck, so I had to do a hard shutdown. Could it be just a coincidence and bad luck, or do the diagnostics show the same thing again?
Attachment: xio-diagnostics-20210205-1458.zip
JorgeB Posted February 6, 2021
gamer1pc said: "does the diagnostics show the same thing again?"
It doesn't show the beginning of the problem with the disks because there's a lot of other crashing, apparently related to IOMMU. If you don't need IOMMU, try disabling it and/or look for a BIOS update.