
Server freezes randomly


chindocaine


Hi all,

 

I've been using unRAID for about a year now, but for the past few months I've had a problem: sometimes the server stops responding over the network. When I try the local console, the login prompt is still on screen, but the machine doesn't respond to any keyboard input.

Most of the time, this error occurred at night, so I thought it was when the mover kicked in. Maybe there was a problem with the drives (which weren't new).

So this week I replaced the parity and storage drives with brand new ones, as well as the cache drive (I wanted to upgrade to an SSD anyway). For good measure I also replaced the power supply (which was a crappy old one before). To make sure there is no problem with the RAM, I ran Memtest86 for about a day without any errors. I set up unRAID completely from scratch.

This morning I couldn't access the server again; it had frozen once more. No networking, no local console. But this time the screen looked quite interesting (see attached photo). Unfortunately, I couldn't get any log files, since I wasn't able to log in before rebooting. There were no VMs or Docker containers running.

Does anyone have any idea what could cause this behavior? Could it be a problem with the boot flash drive? That's pretty much the only thing I haven't replaced or thoroughly tested yet (except for mainboard and CPU, which are still relatively new). Let me know if you need any more data.

Thanks in advance!

 

Michael

unraid.jpg
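
Since the log files were lost on reboot, one way to have them next time is to mirror the system log to storage that survives a reboot while the server is still running. Below is a minimal sketch, not a tested unRAID recipe. It assumes Python is available on the server, that the log lives at a path such as /var/log/syslog, and that the destination (for example a file on the boot flash drive) is persistent. None of this is stated in the thread, and the function names are made up for illustration.

```python
import os
import time


def mirror_new_bytes(src_path, dst_path, offset=0):
    """Append whatever was written to src_path since `offset` to dst_path.

    Returns the new offset. If the source is now shorter than `offset`
    (for example because it was rotated), copying restarts from byte 0.
    """
    size = os.path.getsize(src_path)
    if size < offset:
        offset = 0  # source was truncated or rotated
    if size == offset:
        return offset  # nothing new to copy
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)
        chunk = src.read(size - offset)
        dst.write(chunk)
        dst.flush()
        # Ask the OS to write the data out to the device, so that a hard
        # freeze loses as little of the log as possible.
        os.fsync(dst.fileno())
    return offset + len(chunk)


def follow(src_path, dst_path, interval=5.0):
    """Poll src_path forever and mirror new content to dst_path."""
    offset = 0
    while True:
        offset = mirror_new_bytes(src_path, dst_path, offset)
        time.sleep(interval)
```

Calling `follow("/var/log/syslog", "/boot/syslog-mirror.txt")` (both paths are assumptions) would keep a copy that can be read after the next freeze. Frequent writes add wear to a flash drive, so this is something to run only while chasing the problem.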


Since you say memtest passed an extended test, about the only thing left is the motherboard or CPU. That corrupted screen means the video RAM no longer contains sensible data. Normally, when you see something like that, it means bad video RAM, which on a board with integrated video is system RAM.


Thanks for the reply, the RAM was also one of my guesses, but it's odd since it passed memtest. But yes, it uses integrated graphics. The motherboard is a Gigabyte B85M-D2V, the CPU is an Intel Celeron G1840.

The server just crashed again during a parity check. I suppose last night's crash also occurred during the parity check, which is scheduled for the first day of the month.

I guess I'll try replacing the RAM anyway and see if it helps. I have some other DDR3 modules in another system. Unfortunately I don't have any other compatible mainboards or CPUs here to find out if it's one of them. I'd have to buy a new mainboard and CPU, which would suck.

1 hour ago, chindocaine said:

Unfortunately I don't have any other compatible mainboards or CPUs here to find out if it's one of them. I'd have to buy a new mainboard and CPU, which would suck.

 

Before you go that route, if you have an extra PS, you might want to try that first. This does not fit the pattern of PS problems, but it is quicker and easier to try than a CPU and/or MB swap...


Ok, so the parity sync finished successfully with the other RAM. No more crashes so far. Let's hope it stays that way...

4 hours ago, Frank1940 said:

 

Before you go that route, if you have an extra PS, you might want to try that first. This does not fit the pattern of PS problems, but it is quicker and easier to try than a CPU and/or MB swap...

I don't think the power supply is the culprit, because as mentioned in my first post I changed it a few days ago for a brand new one. I hope it was the RAM.

  • 4 weeks later...

Archived

This topic is now archived and is closed to further replies.
