SMART health failure = imminent disk failure?



I'm ordering two new 16 TB drives. I have 2x 8 TB drives in the array that are at least 4 years old: Seagate Archive drives that were running 24/7 before they were put in that server. These are slow drives that are supposed to be used as cold storage. I'll probably put them in a mirror and keep them as a cold backup, synced once a month.

 

It may explain the weird things happening with my Unraid server that nobody has ever found the cause of, although those should come from RAM since that's where the OS lives, but I don't know.

 

Just need to find an external SATA enclosure now so they can be better cooled.

11 hours ago, Nodiaque said:

It may explain the weird things happening with my Unraid server that nobody has ever found the cause of, although those should come from RAM since that's where the OS lives, but I don't know.

Don't know what you are referring to, but if you suspect a RAM problem you shouldn't even be running your server until you verify RAM is OK. Everything goes through RAM, the OS and any application code, your data, everything. The CPU can't do anything with anything until it is loaded into RAM.

 

Have you done memtest?


I have done memtest. You can check my post history; I have 3 other threads with investigations that led to nothing. RAM was tested last week with the latest version of memtest86+, no errors. The GPU was also swapped.

 

The problems I have (yes, a reboot fixes them, but when you're away that's not ideal):

- ini files in /usr/local/emhttp/state disappear out of nowhere, breaking the webGUI

- Docker tab not working, dashboard loading without Docker info. Had to force a shutdown with the power button because I couldn't reboot even from PuTTY

- Unraid stops working: webGUI unresponsive, all Docker containers down, cannot reboot from the shell (have to force a shutdown with the power button)
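
One way to narrow down when the ini files vanish might be to log a timestamped count of them at regular intervals. The following is only a sketch: the function names `count_ini` and `snapshot` and the `STATE_DIR` variable are made up for illustration, and only the directory path comes from the post above. Since logs kept in RAM do not survive a forced shutdown, the output would need to be redirected to persistent storage of your choosing.

```shell
#!/bin/bash
# Count the *.ini files directly inside a directory (prints 0 if it is missing).
count_ini() {
    find "$1" -maxdepth 1 -type f -name '*.ini' 2>/dev/null | wc -l | tr -d ' '
}

# Print one timestamped line; run it from cron and append the output to a file
# on persistent storage (the destination is left to you, none is assumed here).
snapshot() {
    echo "$(date '+%F %T') dir=$1 ini_files=$(count_ini "$1")"
}

# STATE_DIR is a hypothetical override; the default path is the one from the post.
snapshot "${STATE_DIR:-/usr/local/emhttp/state}"
```

A sudden drop in the count between two log lines would at least give a time window to compare against the syslog.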

 

SMART health is still only at 10% after more than 12 hours... I guess it will fail.
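
If that figure comes from an extended SMART self-test, its progress can also be read from the shell with `smartctl -c`. For ATA drives the capabilities output normally includes a "Self-test execution status" entry stating how much of the test remains (remaining, not completed). This is a sketch under those assumptions: `remaining_pct` is a made-up helper name, and the device path has to be supplied as an argument (for example `/dev/sdX`, a placeholder).

```shell
#!/bin/bash
# Read `smartctl -c` output on stdin and print the "NN% of test remaining"
# figure as a bare number. Prints nothing if no self-test is in progress.
remaining_pct() {
    grep -oE '[0-9]+% of test remaining' | grep -oE '^[0-9]+'
}

# Only query a drive when a device path is passed, e.g. ./script.sh /dev/sdX
if [ -n "${1:-}" ]; then
    smartctl -c "$1" | remaining_pct
fi
```

A slow test on a busy or very large drive is not a failure by itself; the self-test log (`smartctl -l selftest` on the same device) shows how the test ended once it finishes.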

Edited by Nodiaque

Flash drive problems can lead to UI problems. Another thing that can happen is filling rootfs somehow. rootfs is the RAM-backed filesystem the OS runs from. If you fill rootfs, the OS has no space to work with its own files and all sorts of odd things can happen. A common reason for filling rootfs is a Docker mapping to some host path that isn't actual storage.

 

I didn't notice either of those in your diagnostics.

 

Do the problems begin soon after booting, or does it run OK for a while?

 

You can see how much of rootfs is used in the df output. This is in the diagnostics; you can get the same result from the command line with:

df -h
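
To keep an eye on this over time rather than checking by hand, the same df output can be parsed in a small script. This is a minimal sketch, not anything built into Unraid: `usage_pct` and `check_usage` are made-up names, the 90% threshold is an arbitrary illustrative value, and it assumes rootfs is what is mounted at `/`.

```shell
#!/bin/bash
# Print the use% of the filesystem holding a path as a bare integer,
# parsed from the POSIX-format (-P) df output.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage reaches a limit (default 90, an arbitrary example value).
check_usage() {
    local mount="$1" limit="${2:-90}" pct
    pct="$(usage_pct "$mount")"
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $mount is ${pct}% full"
    else
        echo "OK: $mount is ${pct}% full"
    fi
}

check_usage /
```

If the percentage climbs between checks, `du -xh --max-depth=1 /` stays on the root filesystem and shows which top-level directory is growing.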

 


It happens randomly. rootfs was at the same percentage the last time I checked as it is right now, so it's not full. The first time, it took about 3 months; then it happened again within 24 hours. It then ran fine for a while before another problem appeared, and about a month later I got the latest bug, the one with the ini files. We thought it might be the backup plugin, since the problem seemed to start when the backup ended, but it recurred 12 hours after the last crash, far from the backup schedule. Because of that, I haven't tried upgrading to 6.10.x; I want to be sure my base is stable first.

 

I was just wondering: how can the flash drive be a problem if it's not used once the server has booted (since everything is in RAM)?
