tmembrino Posted April 12, 2022

Hoping for advice on next steps to resolve a persistent issue with Disk-5 failing on my server. I've been working on this for several weeks.

The first failure I saw was an increase in the reallocated sector and reported uncorrectable error counts, which I took as signs of a failing drive. This was an 8TB drive from a prior RMA, so I suspected it truly was a drive issue and replaced it with a new retail 8TB drive. I rebuilt the array onto the new drive without issue.

A few weeks in, that new retail 8TB drive started showing errors and the server disabled it. This happened about a week ago. I tried rebuilding onto the same 8TB drive after swapping out the SATA cable (low chance of success, I figured, but I wanted to rule out a cable issue). Not surprisingly, I'm getting the same failure now, with the drive disabled again. I've attached diagnostics.

I'm fine with buying another new drive, but I'm skeptical, since that would be the third drive used in the same location on this server. Is my luck truly that bad, or is there a different problem? Maybe this is an issue with the SATA port? The mobo has 8 SATA ports and this is the 7th port I'm actively using. I also checked the mobo BIOS and I'm many versions behind, so maybe there's something there.

Appreciate any advice from those smarter than me, since I'm sort of attacking this blindly right now. Thanks!

tick-diagnostics-20220412-0834.zip
JorgeB Posted April 12, 2022

Initially it's logged as a disk problem, but then the disk dropped offline, so there's no SMART report. Reboot/power cycle the server to see if the disk comes back online, then post new diags.
tmembrino Posted April 12, 2022 (Author)

Thanks so much for your time and advice! I just rebooted and the server came up with disk-5 still disabled, but I'm getting SMART alerts now. Hopefully there's a bit more valuable info in the attached diagnostics following the reboot.

tick-diagnostics-20220412-1235.zip
JorgeB Posted April 12, 2022

SMART is showing some issues, and the read error is also logged. You should run an extended SMART test.
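
For readers who prefer a terminal to the web UI, here is a minimal sketch of how the extended test suggested above could be started and checked with `smartctl` from smartmontools. It is not something posted in the thread. The device path `/dev/sdX` is a placeholder, the helper names `smartctl_cmd` and `run_smartctl` are made up for illustration, and it assumes `smartctl` is installed and the script runs with root privileges.

```python
import subprocess

# smartctl options used here:
#   -t long      start an extended (long) self-test in the background
#   -l selftest  print the drive's self-test log
#   -a           print the full SMART report
ACTIONS = {
    "start_extended": ["-t", "long"],
    "selftest_log": ["-l", "selftest"],
    "full_report": ["-a"],
}


def smartctl_cmd(device, action):
    """Build the smartctl argument list for one of the actions above."""
    return ["smartctl", *ACTIONS[action], device]


def run_smartctl(device, action):
    """Run smartctl and return its text output (hypothetical helper).

    check=False because smartctl encodes drive health in its exit status,
    so a non-zero code does not necessarily mean the command itself failed.
    """
    result = subprocess.run(
        smartctl_cmd(device, action),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # /dev/sdX is a placeholder; substitute the device that backs disk-5.
    print(run_smartctl("/dev/sdX", "start_extended"))
```

The test runs inside the drive itself, so the command returns right away; the result shows up later in the self-test log (`selftest_log` above).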
tmembrino Posted April 12, 2022 (Author)

Thanks once again for the quick response and support. Really appreciate your time! I ran the extended SMART test and the result is: "Interrupted (host reset)". I downloaded the SMART report (attached) and also attached a fresh download of diagnostics.

tick-smart-20220412-1245.zip
tick-diagnostics-20220412-1338.zip
JorgeB Posted April 12, 2022

2 minutes ago, tmembrino said: "Interrupted (host reset)".

Disable disk spin down and try again.
tmembrino Posted April 12, 2022 (Author)

I set the disk-5 spin down delay to Never and reran the extended SMART test. Same result: "Interrupted (host reset)". I kept an eye on it while it was running and it never seemed to get past 10%. I've attached the SMART log for what it's worth.

Maybe a SATA controller issue? I'm using only the mobo's SATA ports (no add-in cards at this time).

tick-smart-20220412-1342.zip
JorgeB Posted April 13, 2022

12 hours ago, tmembrino said: Same result: "Interrupted (host reset)"

That usually means something is interrupting the test; it should not be controller related.
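
To see how far each attempt got before it was interrupted, the self-test log can be read programmatically. The sketch below is not from the thread: it assumes the usual column layout printed by `smartctl -l selftest` for ATA drives, and the sample text contains made-up values for illustration only, not data from the attached reports. The names `parse_selftest_log` and `was_interrupted` are hypothetical.

```python
import re

# One entry of the ATA self-test log, assuming the layout:
#   "# 1  Extended offline    Interrupted (host reset)   90%   12345   -"
ENTRY = re.compile(
    r"^#\s*(?P<num>\d+)\s+"      # entry number
    r"(?P<test>.+?)\s{2,}"       # test description, ended by 2+ spaces
    r"(?P<status>.+?)\s+"        # status text, e.g. "Interrupted (host reset)"
    r"(?P<remaining>\d+)%\s+"    # percentage of the test still remaining
    r"(?P<hours>\d+)\s+"         # power-on hours when the test ran
    r"(?P<lba>\S+)\s*$"          # LBA of first error, or "-"
)


def parse_selftest_log(text):
    """Return one dict per self-test entry found in smartctl output."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["remaining"] = int(entry["remaining"])
            entry["hours"] = int(entry["hours"])
            entries.append(entry)
    return entries


def was_interrupted(entry):
    """True when the test was cut short rather than completed or failed."""
    return entry["status"].lower().startswith("interrupted")


# Illustrative sample only: the values below are invented.
SAMPLE = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%       412         -
# 2  Short offline       Completed without error       00%       410         -
"""

if __name__ == "__main__":
    for entry in parse_selftest_log(SAMPLE):
        flag = "INTERRUPTED" if was_interrupted(entry) else "ok"
        print(entry["num"], entry["test"], f"{entry['remaining']}% left", flag)
```

An entry that is interrupted with 90% remaining would match the report above that the test never got past 10%. A status such as "Completed: read failure" would instead point at the disk itself.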
tmembrino Posted April 13, 2022 (Author)

Thanks again, I appreciate your input on this. I'll have to investigate more when I'm physically near the server (I had to head out of town for a bit).

Regarding possible causes of the interruption, I'm thinking it can't hurt to update the mobo BIOS, and maybe look at the power supply as a potential cause? The server was rock solid until I added this 5th drive, and the PSU is a bit aged, so maybe something intermittent there?
JorgeB Posted April 13, 2022

4 minutes ago, tmembrino said: Regarding possible causes of interruption I'm thinking can't hurt to update mobo bios and maybe look at power supply as potential cause?

Those don't seem very likely, but honestly, except for spin down, I'm not sure what else could cause that. If you have another PC available, you can try running the test there.