RevelRob Posted March 11

Hi, I posted this Saturday and thought all was good, but I've been experiencing additional odd issues. I am in the process of preclearing a new replacement disk for the one that is failing. I've turned VMs and Docker off so that no resources are being used elsewhere and so as not to stress the parity drive, which would be emulating the disabled disk.

The problem is that after a period of time the server becomes unresponsive: the GUI does not load and I am unable to SSH in. Even sending an orderly shutdown via IPMI fails, so I have to power cycle via IPMI. It also runs extremely slowly now, even after a reboot.

I was on an older version (6.12.6) and updated to 6.12.8 to see if that was the issue. I ran a SMART test on the new drive to see if it was the problem, but it came back with no errors.

Can anyone tell me if they see something awry in the diagnostics file I pulled just now, after a restart and pre-clear start? The diagnostics also took almost 20 minutes to generate. Thanks!

tower-diagnostics-20240311-1035.zip
JorgeB Posted March 11

Mar 11 10:00:09 Tower mcelog: Fallback Socket memory error count 27350 exceeded threshold: 219209113 in 24h
Mar 11 10:00:09 Tower mcelog: Location SOCKET:1 CHANNEL:? DIMM:? []
Mar 11 10:00:09 Tower mcelog: Running trigger `socket-memory-error-trigger' (reporter: sockdb_fallback)
Mar 11 10:00:09 Tower mcelog: Fallback Socket memory error count 30236 exceeded threshold: 219239350 in 24h
Mar 11 10:00:09 Tower mcelog: Location SOCKET:1 CHANNEL:? DIMM:? []
Mar 11 10:00:09 Tower mcelog: Running trigger `socket-memory-error-trigger' (reporter: sockdb_fallback)
Mar 11 10:00:09 Tower mcelog: Fallback Socket memory error count 30236 exceeded threshold: 219269587 in 24h
Mar 11 10:00:09 Tower mcelog: Location SOCKET:1 CHANNEL:? DIMM:? []

Looks like the server may be having RAM issues.
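For anyone who wants to check their own syslog for the same pattern, here is a minimal Python sketch that counts how many mcelog threshold events were logged per reported location. It is not part of the diagnostics tooling; the function name `summarize_mcelog` and the regular expressions are assumptions written against the exact line format quoted above, and other mcelog versions may word these messages differently.

```python
import re
from collections import defaultdict

# Patterns written against the log lines quoted in this thread (an assumption,
# not an official mcelog format specification).
THRESHOLD_RE = re.compile(r"mcelog: .*memory error count \d+ exceeded threshold")
LOCATION_RE = re.compile(r"mcelog: Location SOCKET:(\S+) CHANNEL:(\S+) DIMM:(\S+)")


def summarize_mcelog(lines):
    """Count threshold events per (socket, channel, dimm) location.

    A "Location" line is attributed to the threshold line that precedes it.
    """
    events = defaultdict(int)
    pending = False  # True once a threshold line is seen and awaits its location
    for line in lines:
        if THRESHOLD_RE.search(line):
            pending = True
            continue
        match = LOCATION_RE.search(line)
        if match and pending:
            events[match.groups()] += 1
            pending = False
    return dict(events)


# The eight lines from the post above, used as sample input.
SAMPLE = [
    "Mar 11 10:00:09 Tower mcelog: Fallback Socket memory error count 27350 exceeded threshold: 219209113 in 24h",
    "Mar 11 10:00:09 Tower mcelog: Location SOCKET:1 CHANNEL:? DIMM:? []",
    "Mar 11 10:00:09 Tower mcelog: Running trigger `socket-memory-error-trigger' (reporter: sockdb_fallback)",
    "Mar 11 10:00:09 Tower mcelog: Fallback Socket memory error count 30236 exceeded threshold: 219239350 in 24h",
    "Mar 11 10:00:09 Tower mcelog: Location SOCKET:1 CHANNEL:? DIMM:? []",
    "Mar 11 10:00:09 Tower mcelog: Running trigger `socket-memory-error-trigger' (reporter: sockdb_fallback)",
    "Mar 11 10:00:09 Tower mcelog: Fallback Socket memory error count 30236 exceeded threshold: 219269587 in 24h",
    "Mar 11 10:00:09 Tower mcelog: Location SOCKET:1 CHANNEL:? DIMM:? []",
]

if __name__ == "__main__":
    for location, count in summarize_mcelog(SAMPLE).items():
        print(location, count)
```

To run it against a real log, pass the lines of the syslog file extracted from the diagnostics zip instead of `SAMPLE`. In the excerpt above every event points at socket 1 with the channel and DIMM unknown, so the log alone narrows the problem to that socket rather than to a specific module.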
RevelRob Posted March 11

Thanks for the quick reply! Ah, that's unfortunate. I'm going to try reseating all the RAM modules to see if that helps. I also have more RAM than I need for this machine, so I could remove any modules that have failed. Also, CPU utilization is pegged at 100% even though VMs and Docker are disabled for now.