deano_southafrican Posted June 18, 2022
I took advantage of the sale and upgraded my license today from Basic to Pro. Part of that upgrade process required me to stop and restart the array. After stopping the array, my server became somewhat uncontrollable via the WebUI. After a while I opted to reboot the server, which it did. It then took a while to come back up, and I immediately started hearing a high-pitched noise from one of my drives. Read errors subsequently appeared on one of the disks in the array, followed by parity sync errors. My parity check normally takes about 3 hours to complete; however, it's now at 25% after 6 and a half hours. Fortunately, the affected disk only keeps my network backups and has no other data!
I have no idea how to diagnose disk issues or what exactly could have gone wrong, and while the parity check is running I can't run the SMART test on the drive. What am I looking at here? Replace the disk? Something that can be repaired? I want to get this sorted out so I can safely update to the latest stable release (from 6.9.2). Any advice greatly appreciated.
TLDR: Please see if there's anything badly wrong in the diagnostics?
tower-diagnostics-20220618-2210.zip
trurl Posted June 18, 2022
Your diagnostics don't open for me. How did you create them? Diagnostics are downloaded from the webUI as a single zip file. You don't need to change it in any way, rezip it, or even open it unless you want to look at it yourself (which I recommend).
deano_southafrican Posted June 18, 2022 Author Share Posted June 18, 2022 I've downloaded them straight from Tools > Diagnostics and haven't changed them in any way. I've tried downloading again... tower-diagnostics-20220618-2343.zip Quote Link to comment
trurl Posted June 18, 2022
Looks like connection problems on disk3.
deano_southafrican Posted June 19, 2022 Author Share Posted June 19, 2022 12 hours ago, trurl said: Looks like connection problems on disk3 Connection meaning the physical connections? What gives you this impression so I know what to look for in the logs? I'll test cables and see if I can sort it out. Thanks for your help! Quote Link to comment
deano_southafrican Posted June 19, 2022 Author Share Posted June 19, 2022 So after reseating things and checking my server, I've booted up like normal and Disk3, the same disk giving the issues shows up as "Unmountable". In a short amount of research IO have found people talking about this: xfs_repair -v /dev/md3 and: I want to make sure I'm on the right track, I don't actually care about the data on this particular drive so happy to format it but I'd like to try get to the bottom of it all and test to see if the drive is failing or if it's just a coincidental error. Quote Link to comment
deano_southafrican Posted June 20, 2022 Author Share Posted June 20, 2022 20 hours ago, JorgeB said: Please post new diags. Still experiencing similar issues with Disk3. Parity is averaging 26Mb/s... tower-diagnostics-20220620-1032.zip Quote Link to comment
JorgeB Posted June 20, 2022
Still constant ATA errors on disk3. Replace the cables, both SATA and power, then post new diags after array start.
deano_southafrican Posted June 20, 2022 Author Share Posted June 20, 2022 1 hour ago, JorgeB said: Still constant ATA errors on disk3 In the diagnostics, where do you see these errors, how are they represented? Thank you, once I'm home tonight I'll replace the cables and post new diagnostics. Quote Link to comment
deano_southafrican Posted June 20, 2022 Author Share Posted June 20, 2022 I'm guessing it's this bit? "Jun 20 04:40:14 Tower kernel: ata5.00: failed command: READ FPDMA QUEUED Jun 20 04:40:14 Tower kernel: ata5.00: cmd 60/40:f0:80:5d:45/05:00:39:00:00/40 tag 30 ncq dma 688128 in Jun 20 04:40:14 Tower kernel: res 40/00:10:50:6b:45/00:00:39:00:00/40 Emask 0x50 (ATA bus error)" Will follow your advice re: replacing cables and follow up. Quote Link to comment
JorgeB Posted June 20, 2022
1 hour ago, deano_southafrican said: "I'm guessing it's this bit?"
Yep
trurl Posted June 20, 2022
2 hours ago, deano_southafrican said: "I'm guessing it's this bit?"
and you can see which disk that is associated with earlier in syslog1.txt:
Jun 19 13:37:23 Tower kernel: ata5.00: ATA-10: ST2000DM008-2FR102, WFL4DNV1, 0001, max UDMA/133
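The mapping from an ATA port to a physical drive can be pulled out of that identification line programmatically. The sketch below is only an illustration: `parse_identify` is a hypothetical helper, and reading the second comma-separated field as the serial number and the third as the firmware revision is an assumption based on how this one line looks, not something confirmed in the thread.

```python
import re

# The identification line quoted in the post above.
IDENT_LINE = (
    "Jun 19 13:37:23 Tower kernel: ata5.00: ATA-10: "
    "ST2000DM008-2FR102, WFL4DNV1, 0001, max UDMA/133"
)

# Captures: port, then three comma-separated fields, then the max transfer mode.
IDENT = re.compile(
    r"kernel: (ata\d+\.\d+): ATA-\d+: ([^,]+), ([^,]+), ([^,]+), max (\S+)"
)


def parse_identify(line):
    """Return the fields of an ATA identification line, or None if no match."""
    match = IDENT.search(line)
    if match is None:
        return None
    port, model, field2, field3, mode = (s.strip() for s in match.groups())
    return {
        "port": port,
        "model": model,
        "serial": field2,    # assumption: second field is the serial number
        "firmware": field3,  # assumption: third field is the firmware revision
        "max_mode": mode,
    }


info = parse_identify(IDENT_LINE)
print(info["port"], "->", info["model"])
```

Matching the model (and, if the assumption holds, the serial) against the device list on the Main page tells you which array slot the failing port belongs to.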
trurl Posted June 20, 2022
Unrelated, but your appdata, domains, system shares are on the array. Docker/VM performance will be impacted by slower parity, and array disks can't spin down since these files are always open.
deano_southafrican Posted June 20, 2022 Author Share Posted June 20, 2022 2 hours ago, JorgeB said: Yep 1 hour ago, trurl said: and you can see which disk that is associated earlier in syslog1.txt Big thanks to you both, learning experience for sure! Have attached the new diagnostics after having replaced the cable. Power cable looked fine to me and is supporting another drive on the same strand so I've only replaced the data cable. It's now showing as "Unmountable: not mounted". tower-diagnostics-20220620-1804.zip Quote Link to comment
deano_southafrican Posted June 20, 2022 Author Share Posted June 20, 2022 1 hour ago, trurl said: Unrelated, but your appdata, domains, system shares are on the array. Docker/VM performance will be impacted by slower parity, and array disks can't spin down since these files are always open Thanks, I prioritised my VM over the other stuff as I only had a small SSD, my other (NVMe) SSD keeps unmounting so for stability had to move it all onto the array. Should be able to get a new SSD for this exact purpose soon as I've also just upgraded to the Pro license! Quote Link to comment
deano_southafrican Posted June 20, 2022 Author Share Posted June 20, 2022 (edited) 3 hours ago, JorgeB said: Check filesystem. Absolute legend! This seems to have sorted it out, Parity running at normal speeds! Couldn't have done it without you guys @JorgeB @trurl! Edit: with ---> without... oops! Edited June 20, 2022 by deano_southafrican 1 Quote Link to comment