lishpy Posted February 10, 2017 Hello, Two nights ago I powered down my server to remove a DIMM. I booted it back up and got into the webGUI with no problem, then updated from 6.3 to 6.3.1. From there I started the array and was about to go to sleep, but before I did I noticed a Pushover alert that one of my drives was reporting as hot. I couldn't load the webGUI but I still had SSH access, so I told the server to shut down and figured I'd deal with it later. I woke up the next morning to find the server had never shut down, so I had to turn it off using the power button. Upon reboot a parity check started, of course, but again there was no webGUI access. I got several more Pushover alerts about drive temperature, plus Common Problems alerting me that there was a problem with my server. The parity check finished but there was still no webGUI access. I've attached diagnostics to hopefully shed some light on why the webGUI isn't loading. Unfortunately this morning is my scheduled monthly parity check, so that's running on the server right now. I still have share access and SSH access at the moment. Any assistance would be greatly appreciated. Attachment: tower-diagnostics-20170209-2237.zip
John_M Posted February 11, 2017 There's a lot of bad stuff going on in your syslog that might be related to a memory problem. I suggest running Memtest from the boot menu for a good long time. Why did you remove a DIMM - did you suspect some problem?
lishpy Posted February 11, 2017 Thanks for the response. I removed a DIMM because it was reporting as 4GB but it's an 8GB DIMM. The system had been functioning fine for months, but it was showing 28GB installed instead of 32GB the whole time. Maybe I removed the wrong DIMM? EDIT: I've reinstalled the DIMM I removed and booted into Memtest. I'll run it overnight and report back. Memtest is showing 32GB as of now.
Frank1940 Posted February 11, 2017 If it were me, I would power down the server when Memtest finishes. I would first visually inspect all the DIMMs (a good LED flashlight helps with this) to see that none are tilted, and then push on all of the DIMMs to make absolutely certain they are securely seated in their slots.
lishpy Posted February 11, 2017 They're all secured; I made sure. 15 minutes into the Memtest run there were 3500 errors.
John_M Posted February 11, 2017 Well, that's a result! Maybe not the result you were hoping for, but a clear indication of what you need to do next.
lishpy Posted February 11, 2017 Well, I ended up having to remove 2 sticks of RAM. I narrowed it down to two sticks that show no errors. Of the other two, one I knew for sure was bad (100000+ errors by this morning), and the last one I put back in had 8 errors detected after an hour or so. I can live with only having 16GB of RAM now. I can't recall where I got this RAM, so I have no idea where to make a warranty claim. It's got Crucial stickers on it, but they're covered with some other warranty stickers that aren't coming up on the Crucial site, so it seems I bought them from a reseller. Anyway, the server is back online and the webGUI loads just fine now. Thanks for the assistance, everyone.
Frank1940 Posted February 11, 2017 Be a bit careful when ordering four sticks of RAM for use in a motherboard. I understand that sticks with chips on both sides can be difficult for the address line drivers to drive, and this can cause memory errors.