chindocaine Posted May 1, 2017 Hi all, I've been using unRAID for about a year now, but for the past few months I've had a problem. Sometimes the server stops responding over the network. When I tried the local console, the login prompt was still on screen, but the machine didn't respond to any keyboard input. Most of the time this happened at night, so I thought it was when the mover kicked in. Maybe there was a problem with the drives (which weren't new). So this week I replaced the parity and storage drives with brand new ones, as well as the cache drive (I wanted to upgrade to an SSD anyway). For good measure I also replaced the power supply (the old one was a crappy one). To make sure there is no problem with the RAM, I ran Memtest86 for about a day without any errors. I set up unRAID completely from scratch. This morning I couldn't access the server again. Of course it had frozen again. No networking, no local console. But this time the screen looked quite interesting (see attached photo). Unfortunately I couldn't get any log files, since I wasn't able to log in before rebooting. There were no VMs or Docker containers running. Does anyone have any idea what could cause this behavior? Could it be a problem with the boot flash drive? That's pretty much the only thing I haven't replaced or thoroughly tested yet (except for the mainboard and CPU, which are still relatively new). Let me know if you need any more data. Thanks in advance! Michael
JonathanM Posted May 1, 2017 Since you say memtest passed an extended test, about the only thing it could be is the motherboard or CPU. That corrupted screen means the video RAM no longer contains sensible data. Normally when you see something like that, it means bad video RAM, which is system RAM on a board with integrated video.
chindocaine Posted May 1, 2017 Thanks for the reply. The RAM was also one of my guesses, but it's odd since it passed memtest. But yes, it uses integrated graphics. The motherboard is a Gigabyte B85M-D2V, the CPU is an Intel Celeron G1840. The server just crashed again during a parity check. I suppose last night's crash also occurred during a parity check, which is scheduled for the first day of the month. I guess I'll try replacing the RAM anyway and see if it helps. I have some other DDR3 modules in another system. Unfortunately I don't have any other compatible mainboards or CPUs here to find out if it's one of them. I'd have to buy a new mainboard and CPU, which would suck.
Frank1940 Posted May 1, 2017 Also check that all of the fans are working, that the intake ports and the cooling fins on the heat sinks are clean, and that the inside of the case is reasonably free of dust and dirt. I would certainly check this before embarking on a parts-swapping spree...
chindocaine Posted May 1, 2017 Thanks for the advice Frank, but my system is spotlessly clean and all the fans are running... I just swapped out the RAM and started a new parity check. I'll let you know how it turns out...
Frank1940 Posted May 1, 2017 1 hour ago, chindocaine said: "Unfortunately I don't have any other compatible mainboards or CPUs here to find out if it's one of them. I'd have to buy a new mainboard and CPU, which would suck." Before you go that route, if you have an extra PS, you might want to try that first. This does not fit the pattern of PS problems, but it is quicker and easier to try than a CPU and/or MB swap...
chindocaine Posted May 1, 2017 Ok, so the parity sync finished successfully with the other RAM. No more crashes so far. Let's hope it stays that way... 4 hours ago, Frank1940 said: "Before you go that route, if you have an extra PS, you might want to try that first. This does not fit the pattern of PS problems, but it is quicker and easier to try than a CPU and/or MB swap..." I don't think the power supply is the culprit because, as mentioned in my first post, I replaced it a few days ago with a brand new one. I hope it was the RAM.
chindocaine Posted May 26, 2017 Hi, quick status update: No more crashes since I changed the RAM, so this seems to have been the problem. I still don't get why, since Memtest ran without any errors, but I'm happy that my system is stable now. Thanks again for all your answers!
This topic is now archived and is closed to further replies.