Mashmellow, posted June 24, 2021 (edited June 24, 2021):
Hello, I built a new server fairly recently, and it crashes pretty much daily. When I say "crash", I mean it becomes completely unresponsive, including the WebUI and the console shown when you connect a video output. It usually happens overnight, though sometimes during the day. It doesn't seem to be caused by anything scheduled.
Upon pulling syslogs from the USB stick, I found a docker log indicating that the filesystem had gone into read-only mode, and I found in a thread on here that NFS can sometimes cause this. Disabled NFS, no dice.
I have tried the following:
- Different OS (switched to UnRaid from Ubuntu)
- Different CPU (switched from old crappy AMD CPU to new slightly less crappy AMD CPU)
- Turned off NFS, since I read in a thread that it causes issues and can send the server into read-only mode
I am completely lost on what to do here. I have no clue where to start, so I will just attach logs as people request. I'm also fairly new to Linux and Ubuntu, so there might be something simple I am overlooking. Might it be a power supply issue? Just an issue with my wall power? Motherboard?
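For anyone wanting to check a pulled syslog for the kind of read-only remount mentioned above, a minimal Python sketch could look like the following. The marker strings are assumptions, not a list of what Unraid or any particular filesystem actually logs, so they would need adjusting to match the real log text.

```python
# Assumed, illustrative substrings; real kernel/filesystem messages vary.
MARKERS = ("read-only", "readonly", "i/o error")


def suspicious_lines(lines, markers=MARKERS):
    """Return (line_number, text) for every line containing a marker.

    Matching is a plain case-insensitive substring test.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits
```

It takes any iterable of lines, so an open file object for the syslog copied off the USB stick can be passed in directly. A hit only shows where to start reading; it does not say what caused the remount.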
JonathanM, posted June 24, 2021:
Are you running your RAM at the speed the motherboard and CPU are specified for? The speed the memory is rated to run at is rarely supported by the CPU in modern AMD systems. Have you run at least 24 hours of memtest with zero errors?
Mashmellow (Author), posted June 25, 2021 (edited June 25, 2021 to add more info):
This issue has persisted across a memory change (went from 16 GB to 32 GB, completely new kit, all old memory removed). I haven't run a memtest yet. Also, I just installed NetData last night, and its logs show a massive spike in drive usage for a very short period; then it stabilized, then the server crashed. I am not home at the moment, but I will run a memtest for 24 hours when I get the chance. I have not changed anything regarding memory settings.
ChatNoir, posted June 25, 2021:
Your diagnostics would give a better understanding of your system and help us provide appropriate guidance. Go to Tools / Diagnostics and attach the full zip to your next post.
Mashmellow (Author), posted July 8, 2021:
Sorry for the highly delayed response, I have attached a diagnostic log here. Also, it may be helpful to document that I moved the system into a new case, and all power & other connections had to be reseated.
Attachment: apollo-diagnostics-20210707-2325.zip
Mashmellow (Author), posted July 8, 2021:
I also have syslog set to go to my old Synology NAS, and it just abruptly stops. I can include that here as well if it will help.
JorgeB, posted July 8, 2021:
Mashmellow said: "and it just abruptly stops."
If there's nothing logged about the crash, it suggests a hardware issue. You can try running the server in safe mode without dockers/VMs for a few days. If it still crashes like that, it's likely a hardware problem; if it doesn't, start turning the services back on one by one.
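To see exactly where a remote syslog goes silent and what was logged just before, a rough Python sketch along these lines may help. It assumes traditional syslog timestamps of the form `Jul  7 23:25:01` at the start of each line (which carry no year, so one has to be supplied), and the 10-minute threshold is an arbitrary guess; an idle server can be quiet for longer than that without anything being wrong.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape: "Jul  7 23:25:01 hostname process: message"
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s")
MONTHS = {name: index for index, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_stamp(line, year):
    """Return the line's timestamp, or None if it has no recognisable one."""
    match = STAMP.match(line)
    if not match or match.group(1) not in MONTHS:
        return None
    month, day, hour, minute, second = match.groups()
    try:
        return datetime(year, MONTHS[month], int(day),
                        int(hour), int(minute), int(second))
    except ValueError:  # e.g. "Feb 31" in a corrupted line
        return None


def find_gaps(lines, year, threshold=timedelta(minutes=10)):
    """Yield (last_line_before, first_line_after, gap) for each long silence."""
    prev_time, prev_line = None, None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip lines without a timestamp
        if prev_time is not None and stamp - prev_time > threshold:
            yield prev_line.rstrip("\n"), line.rstrip("\n"), stamp - prev_time
        prev_time, prev_line = stamp, line
```

Calling `list(find_gaps(open("syslog"), 2021))` on a saved log (the file name here is hypothetical) gives the last line before each silence. If those lines are all routine, that fits the "nothing logged about the crash" pattern described above.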
Mashmellow (Author), posted July 8, 2021:
Ok. That's sorta what I expected. I'm not even going to bother with safe mode, as it has done the same thing across 3 different operating systems, including Windows. The motherboard and CPU are going to be replaced, as they are both slow and crappy anyway. Any suggestions on budget options?
kizer, posted July 8, 2021:
As well, I had an old power supply that was a little wonky and dying, I guess. I replaced it and all has been well for the past 5 or so years. 😃
Mashmellow (Author), posted July 8, 2021:
Brand new EVGA Silver certified PSU. Unlikely to be the problem.
dalben, posted July 9, 2021:
Look at any plugins you use and see if any were updated around the time the problems started. My server started experiencing lockups last weekend. I had 3 in 4 days. Nothing at all in the syslog, the server just halted. I removed a recently updated plugin and it's been stable for 2 days. I'll let it go a bit longer before claiming I found the issue. As there have been no other reports of random lockups and it's a very widely used plugin, I'm not confident enough to name it until I am sure.
If it's been doing it from day 1, then maybe it is hardware related. My previous build would lock up at odd times. Sometimes nothing for two weeks, then a few in succession, at very random times and frequency. Nothing in the syslog before the lockup. It had been going on for a year or so. I finally swapped out the motherboard/CPU and RAM and replaced them with components I had in my main PC. Net result: my server stopped locking up randomly and was stable until recently. My PC, with the old server internals, hasn't missed a beat either.