
Can't pinpoint my issue. Need help. Multiple SMART errors, missing disks, failed drives.


Go to solution Solved by barefootwarrior,


I'm feeling a bit defeated here. I can't pinpoint what the issue is on my server, and it's starting to cause major concerns about my data.

 

I have multiple drives throwing SMART errors, and I've had two fail within one month. I just rebooted my server three times, and each time a different drive showed up as missing. I thought I knew what I was doing, but after having to re-download an entire failed drive's worth of files and reinstall and reconfigure my appdata folder, I question whether I really know what I'm doing at all.

 

I'm not sure if the diagnostics will show it, but I am using an LSI SAS 9207-8i PCI-E 3.0 HBA, as my motherboard does not have enough SATA ports.

 

Edit: I also added a SMART report for the most recent failed drive. I currently have two drives that are "missing" that weren't missing before a reboot.

 

barefootwarrior-diagnostics-20231201-1911.zip

barefootwarrior-smart-20231201-1936.zip

Edited by barefootwarrior
Added a bit of info.

Thanks for the input, guys. I'm swapping the PSU today and I have new breakout SATA cables on the way.

Is there any way to get the disabled emulated drive back up and running?

I purchased the 14 TB Seagate external drives and got two EXOS drives. I was hoping to upgrade my parity when the drive threw the errors and went disabled. I picked up two of those and was thinking of running them both as parity drives since I've been having so many issues recently. Eventually I'll upgrade the array, but I'm getting a bit tapped out financially at the moment.


Just finished that for the problem drive. Everything went smoothly, no errors for that drive.

Now I have a different drive that is emulated and says it threw some SMART errors. This is the drive that I swapped in for the other one that failed earlier this month, so it's brand new and wasn't giving me any issues prior to this.

Am I going to have to do yet another 10+ hour rebuild on top of that drive? And if so, am I stuck doing this perpetually? I could have simply redownloaded the contents faster than the previous rebuild.

What the heck is going on with this server?

 

barefootwarrior-smart-20231203-0858.zip

Edited by barefootwarrior
  • Solution

I just put in a brand new power supply that's overkill for my needs but supplies SATA power to all my drives without the need for adapters. I got curious, so I booted into my BIOS and noticed that my LSI adapter only showed 7 of 8 drives even there. I had four open SATA ports on my MOBO, so I swapped those HDDs back to the onboard SATA ports and everything booted up just fine. I'm not sure if it's the LSI card or the cables that are the issue, but I have a replacement for both on the way, as the cards and cables are cheap. I wanted to have all my array drives arranged in logical order physically inside my box, but I guess it really doesn't matter all that much as long as the drives are working and UNRAID knows where each one is.

 

In the meantime, I'm running a rebuild on that drive that said it failed. I guess my family will have to live one more day without access to their entertainment. C'est la vie.

 

I used to have Molex>SATA power adapters, and everywhere else that I read online told me that was not super safe and to switch to SATA>SATA if I needed the extra SATA power, so I'm curious as to why you'd recommend the opposite. That said, in my situation it's a moot point, as I now have a modular power supply that fits my current needs and gives me headroom for growth.

 

I'm going to mark this as resolved. I'll post my steps below for future reference.

*SOLUTION*

Drives were randomly throwing SMART errors and showing up as missing after reboots. The first drive that did that I just swapped out for a formatted replacement and redownloaded everything because at that point I had rebuilt the parity drive.

When another drive failed I suspected it was hardware related. I was running all my array drives and parity through the LSI card and used some SATA>SATA extensions for power. I swapped in more reliable SATA power (a bigger PSU) and moved the problematic HDDs from the LSI card back to the mobo, as I had some open ports. Everything works much more reliably now. I'm guessing it wasn't actually PSU related but is an issue with my 4-SATA SFF-8087 Multi-Lane Forward Breakout Internal Cables or the LSI HBA SAS 9207-8i PCI-E 3.0 card. I have replacements on the way, and if I have time I'll test them to see if they were at fault.
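For anyone trying to tell a cable/HBA problem from a genuinely failing disk, here is a rough sketch of the kind of check that can help. It assumes the JSON layout produced by `smartctl --json -A` (smartmontools 7.0 or later). The choice of attribute IDs and the `classify` helper are my own illustration, and the sample report is made up rather than taken from the diagnostics attached to this thread. Attribute 199 (UDMA CRC errors) usually counts link-level errors, while 5, 197 and 198 relate to sectors on the disk itself.

```python
import json

# Attribute IDs picked for illustration only (an assumption, not a rule):
# 199 = UDMA CRC errors, usually a link problem (cable, backplane, HBA port)
# 5 / 197 / 198 = reallocated, pending, and uncorrectable sectors (the disk)
LINK_ATTRS = {199}
MEDIA_ATTRS = {5, 197, 198}


def classify(report: dict) -> str:
    """Give a rough hint from one parsed `smartctl --json -A` report."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    raw = {row["id"]: row["raw"]["value"] for row in table}
    media_errors = sum(raw.get(attr, 0) for attr in MEDIA_ATTRS)
    link_errors = sum(raw.get(attr, 0) for attr in LINK_ATTRS)
    if media_errors > 0:
        return "suspect drive"
    if link_errors > 0:
        return "suspect cable/HBA"
    return "no obvious signal"


# Made-up sample data, trimmed to the fields the function reads.
sample = json.loads("""
{
  "ata_smart_attributes": {
    "table": [
      {"id": 5,   "name": "Reallocated_Sector_Ct", "raw": {"value": 0}},
      {"id": 199, "name": "UDMA_CRC_Error_Count",  "raw": {"value": 37}}
    ]
  }
}
""")

print(classify(sample))
```

A non-zero CRC count on several drives that share one breakout cable or one HBA would fit what I saw here, but treat the result as a hint, not a diagnosis.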

 

Thanks for the help guys.

 

Edited by barefootwarrior
10 hours ago, barefootwarrior said:

I used to have Molex>SATA power adapters and everywhere else that I read online told me that was not super safe and to switch to SATA>SATA if I needed the extra SATA power so I'm curious as to why you'd recommend the opposite.

That is only the case if they are badly manufactured - and in my experience this tends to be more common with SATA->SATA splitters.

 

Another point is that SATA->SATA splitters can rarely handle more than a 1->2 split, because the current that can flow across the host-end SATA connector without voltage sag is limited. With Molex->SATA connectors you can normally use 1->4 splitters without current draw issues, as the Molex connector uses much larger pins capable of carrying more current.
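As a back-of-the-envelope illustration of the current limit described above, here is a small sketch. The SATA figure follows from the connector having three 12 V pins rated at 1.5 A each. The Molex pin rating and the per-drive spin-up draw are assumed round numbers for illustration only, so check the datasheets for your own connectors and drives. Wire gauge and build quality can lower the real limit further.

```python
def max_drives(rail_limit_amps: float, per_drive_amps: float) -> int:
    """Number of drives one upstream connector can feed on the 12 V rail."""
    return int(rail_limit_amps // per_drive_amps)


SATA_12V_LIMIT = 3 * 1.5   # three 12 V pins at 1.5 A each
MOLEX_12V_LIMIT = 11.0     # assumed rating of the single 12 V Molex pin
SPINUP_AMPS = 2.0          # assumed 12 V draw of one 3.5" drive at spin-up

print("SATA->SATA :", max_drives(SATA_12V_LIMIT, SPINUP_AMPS))
print("Molex->SATA:", max_drives(MOLEX_12V_LIMIT, SPINUP_AMPS))
```

With these assumed numbers a SATA source tops out at two drives, while a Molex source has room for four with some margin.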

