Mattaton Posted July 29, 2020 (edited)
In the past week my server has begun to crash randomly. I can't figure out exactly what task or process is triggering this, and I'm hoping someone here can help me shed some light on it. I've attached my most recent log. Last crash was on 7/29 and I didn't turn it back on until today. It crashed 3 or 4 times prior to that, but I have no logs for those.

Many of the errors at the beginning of the file are because it's trying to write logs to the syslog share (which is on cache), but the cache drive was full from a backup operation. I have since moved the backup operation to write straight to the array so it doesn't fill the cache while I'm trying to gather logs. (Other suggestions there are appreciated - I tried to set up a syslog server on a Windows machine, but couldn't get it to work.)

You can see in the logs that after Jul 27 17:39:16 I got some BTRFS warnings about files it appears the Mover couldn't move. I think this is because a crash occurred while they were being written. I deleted the files from the cache and let the backup run again, which got those files onto the array. Not too worried about this.

I think the problem is at Jul 27 22:09:05:
Jul 27 22:09:05 TyreeMedia kernel: general protection fault: 0000 [#1] SMP PTI
Jul 27 22:09:05 TyreeMedia kernel: CPU: 3 PID: 32449 Comm: find Tainted: P O 4.19.107-Unraid #1

Is this a CPU issue? I've also attached my diagnostics output. Thanks!

Attachments: syslog-192.168.75.12.log, tyreemedia-diagnostics-20200729-1756.zip
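For anyone combing through a long syslog like the one attached above, here is a minimal Python sketch that pulls out lines that look crash-related. The pattern list and the function names (`find_crash_lines`, `scan_file`) are assumptions made for illustration; only the "general protection fault" and BTRFS warning messages come from this thread, so adjust the patterns to whatever your own log shows.

```python
import re

# Substrings that often accompany a kernel-level crash. This list is an
# illustrative assumption, not an exhaustive or official set.
CRASH_PATTERNS = [
    r"general protection fault",
    r"kernel BUG at",
    r"Kernel panic",
    r"Hardware Error",
    r"BTRFS (?:warning|error)",
]
CRASH_RE = re.compile("|".join(CRASH_PATTERNS), re.IGNORECASE)


def find_crash_lines(lines):
    """Return (line_number, text) pairs for lines matching any crash pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if CRASH_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan a syslog file on disk, replacing any undecodable bytes."""
    with open(path, errors="replace") as handle:
        return find_crash_lines(handle)
```

Calling `scan_file("syslog-192.168.75.12.log")` would return the matching lines with their line numbers, which makes it easier to see what was logged just before each crash. A match only shows where to look; it does not say whether the cause is CPU, RAM, or something else.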
Mattaton Posted July 29, 2020 (Author)
Okay. Looking on the forums for similar errors, I saw that MemTest was something to try. See the attached photo. Ummmm.... I'm gonna say this isn't good, but is it even possible for 4 sticks to COMPLETELY FAIL? I know pretty much nothing about this stuff, but this seems like too many errors to be reasonable. Since taking the photo, the current pass has reached 16% with over 1,500 errors, and the completed pass count is still 0. Am I looking at buying new RAM, or could this be a mobo issue?
Michael_P Posted July 29, 2020
You could try re-seating the RAM, or try the stick throwing errors in another slot.
Mattaton Posted July 29, 2020 (Author)
Shuffled all RAM sticks in same slots. Still lighting up like a Christmas tree. How do I know which stick is throwing the errors?
Zonediver Posted July 29, 2020 (edited)
35 minutes ago, Mattaton said: "Shuffled all RAM sticks in same slots. Still lighting up like a Christmas tree. How do I know which stick is throwing the errors?"
Test the RAM sticks one at a time in the same (working) slot; then you will see which one is faulty. Also check that the RAM is not running overclocked: deactivate XMP! Also possible: a defective RAM slot on the mainboard... not funny, but possible...
trurl Posted July 30, 2020
49 minutes ago, Zonediver said: "Also possible: A defective RAM-Slot on the Mainboard..."
Which you should also test for by trying that single stick in another slot if the test fails.
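The one-stick-at-a-time procedure described in the replies above can be written down as a small decision helper. This is only a sketch under stated assumptions: the stick labels, slot names, and the `diagnose` function are hypothetical, and each result is assumed to come from a separate single-stick MemTest run.

```python
def diagnose(results):
    """Classify failures from single-stick memory tests.

    `results` maps (stick, slot) -> True if that run passed, False if it
    threw errors. Returns (bad_sticks, bad_slots, unresolved).
    """
    # Anything that passed at least once is treated as known good.
    good_sticks = {stick for (stick, _), passed in results.items() if passed}
    good_slots = {slot for (_, slot), passed in results.items() if passed}

    bad_sticks, bad_slots, unresolved = set(), set(), []
    for (stick, slot), passed in results.items():
        if passed:
            continue
        stick_ok = stick in good_sticks
        slot_ok = slot in good_slots
        if slot_ok and not stick_ok:
            # Failed in a slot that works with another stick: blame the stick.
            bad_sticks.add(stick)
        elif stick_ok and not slot_ok:
            # A stick that works elsewhere failed here: blame the slot.
            bad_slots.add(slot)
        else:
            # No passing run to compare against, or conflicting evidence.
            unresolved.append((stick, slot))
    return sorted(bad_sticks), sorted(bad_slots), sorted(unresolved)
```

For example, `diagnose({("A", "slot1"): True, ("B", "slot1"): False, ("A", "slot2"): False})` points at stick B and slot2. If every run fails, everything lands in `unresolved`, which is the situation where the board, CPU socket, or memory settings such as XMP become the more likely suspects.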
Michael_P Posted July 30, 2020
Bent pin in the CPU socket could knock out a bank, too - ask me how I know :)
Mattaton Posted July 30, 2020 (Author)
1 hour ago, Michael_P said: "Bent pin in the CPU socket could knock out a bank, too - ask me how I know"
Yeeeesshhh... fun! I'm hoping that's not the case, since it's been working fine for a long time and the CPU hasn't been removed, so the pins have never been exposed. I'm toying with the idea of just replacing the mobo, CPU, and RAM. This PC was already aging when I put it into service as an unRAID server. With the extra duties I'm throwing at it, like backups from my Windows PCs, I should look at some new hardware for it rather than slapping band-aids on this build. Time to start researching how much unRAID likes Ryzen 3000/X570.