Failure prevention...


cj0r

I just realized that my unRAID server has been going strong for almost two and a half years now: full 24/7 operation with at least 7 drives in it from the start, and temperatures getting as warm as 95F during the hottest summer hours.

 

What is the general consensus among other users on staying ahead of catastrophic failures such as a dead motherboard?  How often do you plan on upgrading, replacing, or swapping out components?

 

My setup is this:

Corsair 550VX PSU

Abit AB9 Pro motherboard

Celeron 440 CPU

4GB Crucial 5300 DDR2 Memory

Old PCI VGA video card that I never use but is required to be installed

1x IcyDock 5 in 3 Bay Convertor

2x Rosewill PCI-Ex1 RC-213 SATA controller cards

2GB Sandisk Micro Cruzer USB Flash drive

 

I already replaced all my older hard drives with new 2TB drives, starting last November.  There are 8 total at this point.  Everything else is around the same age, with the PSU, motherboard and flash drive being the oldest.

Link to comment

Monthly parity checks...

Periodic cleaning of air filters/fans

Monthly SMART tests to detect drives with marginal hardware.

Use of a UPS to keep power outages from affecting the server.
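
For the SMART item in the list above, here is a minimal sketch of a scripted health check. It assumes smartmontools is installed and that the drives are ATA/SATA, since it looks for the "overall-health" line that `smartctl -H` prints for ATA devices. The function names `smart_health_ok` and `check_drive` are made up for this example, and `/dev/sda` is only a placeholder device.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -H` output on stdin and succeeds only
# if the ATA overall-health line reports PASSED.
smart_health_ok() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Hypothetical wrapper: check one device, e.g. check_drive /dev/sda
check_drive() {
    dev="$1"
    if smartctl -H "$dev" | smart_health_ok; then
        echo "$dev: health PASSED"
    else
        echo "$dev: did not report PASSED, look closer with: smartctl -a $dev"
        return 1
    fi
}
```

A passing health line does not prove a drive is fine. Running `smartctl -t long /dev/sda` starts an extended self-test, and `smartctl -a /dev/sda` shows the attributes and the self-test log afterwards.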

 

Other than that... enjoy the server.  My first unRAID server will be going on 5 years old in a few more months.

 

 

Link to comment

Hmm, but those tests would only really show hard drive, cable, or connector issues.  I'd be more concerned with sudden hardware failure like a motherboard dying (which has happened to me before in other computers).  The longer your hardware is in use, the higher the chance of something like that happening, I'd think.  The normal maintenance procedures are obviously a given for prolonging life, but it's those unpredictable deaths that really ruin your day, and I'd love to focus on preventing them.

 

By the way, I love my freshly bought 1500VA APC unit :).

Link to comment

I'd be more concerned with sudden hardware failure like a motherboard dying (which has happened to me before in other computers). 

Keep a printed listing of your "devices" page.

Keep a copy of the "config" folder from your flash drive (and of any other folders you may have added there, such as "packages" or "unmenu").
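
One way to script the config copy, as a rough sketch only: it assumes the flash drive is mounted at /boot (the same place the devices.txt command in this thread writes to) and that `tar` and `date` are available. The function name `backup_flash_config` and the archive naming are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: archive the "config" folder found under SRC into DEST.
# SRC defaults to /boot (assumed flash mount point); DEST defaults to the
# current directory.
backup_flash_config() {
    src="${1:-/boot}"
    dest="${2:-.}"
    stamp=$(date +%Y%m%d)
    archive="$dest/flash-config-$stamp.tar.gz"

    # Refuse to continue if there is nothing to back up.
    [ -d "$src/config" ] || { echo "no config folder under $src" >&2; return 1; }
    mkdir -p "$dest" || return 1

    # -C keeps the paths inside the archive relative ("config/...").
    tar -czf "$archive" -C "$src" config && echo "$archive"
}
```

Called as `backup_flash_config /boot /mnt/disk1/backups` (the destination path is only an example), it prints the name of the archive it wrote. Copying that archive to another machine is the point, since a backup that lives only on the server does not help when the server is the thing that died.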

 

Because unRAID really does not care which hardware you use, you are not subject to the same pains as when a Windows PC's motherboard dies.

 

Joe L.

 

Link to comment

Keep a printed listing of your "devices" page.

 

Is there a file/text version of the devices page that could be "backed up" also?

 

The problem with printed pages is that they are never around when you want them, whereas if I keep a soft copy, I usually have it in multiple locations and can find it faster and more easily.

 

Link to comment

You can save the html file for the "devices" page from your browser,

or, you can log onto unRAID via telnet or the console and type:

 wget localhost/devices.htm -q -O -  | sed -n "/Disk devices/,/Stopped/p" | grep -v "tr>" | sed "s/<[^<]*>//g" | tee /boot/devices.txt

 

It will put the output in a file named devices.txt on your flash drive.
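
For anyone curious what the pipeline above does: `wget -q -O -` fetches the page to standard output, the first `sed` keeps only the lines from "Disk devices" through "Stopped", `grep -v "tr>"` drops the table-row tag lines, the second `sed` strips the remaining HTML tags, and `tee` shows the result while also writing it to the file. Here is a small sketch of just the tag-stripping step, run on a made-up sample line rather than real unRAID output; the function name `strip_tags` is only for illustration.

```shell
#!/bin/sh
# Same substitution as in the command above: delete anything shaped like <...>.
strip_tags() {
    sed 's/<[^<]*>//g'
}

# Made-up sample row, not real unRAID output.
printf '<td>disk1</td> <td>example-model</td>\n' | strip_tags
# prints: disk1 example-model
```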

Link to comment

Well, my primary concern with this is downtime, but as I think about it, there shouldn't be any huge issues other than no access to media for a week or so.  Any important documents stored on the server would (should? who knows what my family is doing) be kept elsewhere too, so that wouldn't be an issue, since the server is in fact a backup.

 

Good call on the PSU tester.  I have one, but it is pretty outdated at this point, so I should probably invest in something newer.

Link to comment

Good call on the PSU tester, I have one but it is pretty outdated at this point so I should probably invest in something newer.

A PSU that is going bad (supply lines losing regulation) will probably not show as bad on a simple tester like the one you suggested.  It might not even show on a more advanced one that displays actual voltages.  It will show when used with electronics that are sensitive to voltage and noise on the supply lines (memory, CPU, disks).

 

The PSU tester will let you know for sure when a power supply has failed, but it will never reveal one that causes an occasional error when asked to spin up all your disks; testers just do not place that kind of load on it.

 

Joe L.

Link to comment

Archived

This topic is now archived and is closed to further replies.
