elnorte13524 Posted April 25, 2023 I got an email out of the blue today saying that my disk 4 was disabled. It's a slightly older disk, maybe 2 years old, so I'm not sure if I just need to give up on it and replace it, or if I can re-enable it and ride it out. I've got 6 disks in my array, 2 of which are parity drives. Browsing the forum, it looks like it's best to start a new thread with the diagnostics attached rather than post on an existing one, so I've attached the diagnostics and am hoping for some guidance on the disk. I just built the new server and added two 16 TB drives, so cash is tight right now, which is why I'm hoping I can survive with that disk for a little while rather than dropping another couple hundred bucks to replace a 14 TB drive that might be failing. I've also attached the quick SMART test from that drive. I don't know what to look for, but I see there are some CRC errors. The count is 6 and I think it was 4 a while ago, so it's very slowly incrementing. I've heard that could be cabling, so when I get time I'll replace the SATA cable, but it'll be a day or two before I can get to that, so I wanted to see if there was anything else I needed to look at or be concerned with before then. Any help would be much appreciated! tower-diagnostics-20230425-1323.zip tower-smart-20230425-1109.zip
JorgeB Posted April 26, 2023 The disk was already disabled at boot, so we cannot see what happened, but SMART looks fine. There are some CRC errors, so it is possibly just a bad SATA cable. Replace that and rebuild.
elnorte13524 Posted April 26, 2023 Author OK, thanks. My server locked up yesterday when I wasn't home and wasn't even responsive to the keyboard connected to it. I could ping it, but SSH wouldn't work and the web interface refused the connection, so I had to kill power, which of course triggered a parity check when I fired it back up. Once that finishes I'll shut it down and replace the SATA cable. I have 2 other drives with the same error, though the count seems to be staying the same and those 2 are still working fine. I panicked and ordered 3 drives yesterday: since one was disabled, I was worried the other 2 might not be far behind. Is it worth hanging onto them for a couple of weeks before sending them back in case I do have a disk die, or is it pretty likely that it's just the SATA cable and it should be an easy fix?
itimpi Posted April 26, 2023 CRC errors are rarely the drive - the cabling is nearly always the culprit. Do not forget - it can also be the power cabling (or even the PSU itself) and not just the SATA cabling.
trurl Posted April 26, 2023 1 hour ago, elnorte13524 said: "the count seems to be staying the same" You can acknowledge the current count by clicking on the SMART warning ( 👎 ) for the drive on the Dashboard page, and it will warn again if it increases.
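For anyone who would rather track the CRC count from a script than from the Dashboard, here is a minimal sketch. It assumes you have saved the ATA attribute table printed by `smartctl -A` as text and that the table uses the usual ten-column layout with the raw value in the last column. The function names and the sample line are hypothetical and only for illustration; the counts 4 and 6 are the ones mentioned in the first post.

```python
def crc_error_count(smart_attributes_text):
    """Return the raw UDMA_CRC_Error_Count value, or None if it is absent.

    Assumes the ten-column ATA attribute table from `smartctl -A`:
    ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
    """
    for line in smart_attributes_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            return int(fields[9])
    return None


def crc_increased(previous, current):
    """True only when both counts are known and the new one is higher."""
    return previous is not None and current is not None and current > previous


# Illustrative attribute line, not taken from the attached diagnostics.
sample = (
    "199 UDMA_CRC_Error_Count    0x003e   200   200   000"
    "    Old_age   Always       -       6"
)

current = crc_error_count(sample)
if crc_increased(4, current):
    print(f"CRC error count rose to {current}; check SATA and power cabling")
```

A count that stays flat after a cable swap points to old errors. A count that keeps climbing suggests the cabling, power, or controller problem is still there.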
elnorte13524 Posted April 27, 2023 Author I'm not sure if this warrants a new post, but since it's related I'll add it here for now. I replaced the SATA cables on drives 3 and 4 and reseated the connections on the other drives to make sure everything was secure. I fired it up and started a parity rebuild last night, but at about 7:45 am today it stopped and disabled disks 3 and 4. It made me start a read check, and now I'm getting a ton of errors on disk 2. Things on the array are running fine for now, but I'm worried I'm about to lose my whole array. This is the 2nd or 3rd time I've had the parity rebuild fail. Are my disks dying? Do I have crappy SATA cables? I have disks connected to the motherboard and to a SATA riser, and I have disks working well on both connection types. All the hardware except the disks is brand new, and the disks all worked fine on my last machine (before I shucked the drives), but it was slower. Any help would be much appreciated! I'm losing my mind. tower-diagnostics-20230427-1504.zip
JorgeB Posted April 28, 2023 Looks like a controller issue:
Apr 27 13:11:40 Tower kernel: ata9.15: failed to reset PMP, giving up
Apr 27 13:11:40 Tower kernel: ata9.15: Port Multiplier detaching
Avoid controllers with SATA port multipliers.
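To check a saved syslog for the same symptom, a rough sketch like the one below can help. It only looks for the two message strings quoted in the post above; other port-multiplier messages may exist and are not covered. The function name `find_pmp_events` is hypothetical.

```python
import re

# Patterns taken from the two kernel messages quoted in the thread.
PMP_PATTERNS = [
    re.compile(r"failed to reset PMP"),
    re.compile(r"Port Multiplier detaching"),
]

# Matches an ATA port label such as "ata9" or "ata9.15" followed by a colon.
ATA_PORT = re.compile(r"\b(ata\d+(?:\.\d+)?):")


def find_pmp_events(log_text):
    """Return (ata_port, line) pairs for lines matching a PMP pattern."""
    events = []
    for line in log_text.splitlines():
        if any(p.search(line) for p in PMP_PATTERNS):
            match = ATA_PORT.search(line)
            events.append((match.group(1) if match else None, line.strip()))
    return events


sample = """Apr 27 13:11:40 Tower kernel: ata9.15: failed to reset PMP, giving up
Apr 27 13:11:40 Tower kernel: ata9.15: Port Multiplier detaching"""

for port, line in find_pmp_events(sample):
    print(port, "->", line)
```

If every hit names the same `ataN` port, the disks behind that port share one controller path, which fits a controller fault better than several disks failing at once.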
elnorte13524 Posted April 28, 2023 Author Damn. I had seen a few similar recommendations after I built this machine. So with only 4 SATA ports on my motherboard, is there another way to use 6-8 drives?
elnorte13524 Posted April 28, 2023 Author Awesome, thanks! I'll order something today and hope to get it installed this weekend.