July 24, 2024
Hi, I have had this Unraid server for a few months, on 6.12.10, and it was rock solid until now. (The key points are listed as bullets so the post is easier to skim.)
I modified the hardware pretty extensively a week ago:
- Motherboard swap (MSI MAG B550)
- Same CPU (AMD 5600X)
- Added an RTX 3090
- RAM upgraded from 16 GB to 32 GB
- Power supply swapped for a nicer one, a Platinum 850 W unit from ASUS
As you may have guessed, I am in the process of building my own multi-purpose AI box.
After the hardware changes the server booted fine and ran for about 3 days, then crashed for no apparent reason. I forced a shutdown and rebooted. Thinking it might be a glitch, I disabled all the AI-related Docker containers. One day later it crashed again. I rebooted, and less than a day later it crashed once more. It crashed again today.
I am currently testing the RAM with memtest, but I don't believe this is a RAM issue. I remember having a similar issue when this Unraid server ran on an AMD 1700: I had to disable C-states and also add some code at boot. Since this felt similar, I updated the motherboard BIOS, which was 2 versions behind, so no big deal.
I'll keep you updated on my side. This could help others with a similar issue as well. Thanks for reading.
Attachments: syslog-192.168.20.125.log, exposed-diagnostics-20240721-1858(1).zip
Edited July 24, 2024 by A.sch3
July 24, 2024 Community Expert
Was this when it last crashed?
Jul 23 04:45:43 Exposed kernel: ---[ end trace 0000000000000000 ]---
Jul 23 04:45:43 Exposed kernel: BUG: unable to handle page fault for address: 0000000002d5a768
Jul 23 04:45:43 Exposed kernel: #PF: supervisor write access in kernel mode
Jul 23 04:45:43 Exposed kernel: #PF: error_code(0x0002) - not-present page
Also make sure this has been taken care of: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173
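For anyone who wants to look for the same kind of kernel entries in their own saved syslog, here is a minimal Python sketch. It assumes the standard syslog line layout seen in this thread (timestamp, hostname, then `kernel:`), and the marker strings are simply taken from the lines quoted in this post. The function name `find_kernel_faults` is made up for illustration and is not part of any Unraid tool.

```python
import re
import sys

# Marker substrings taken from the kernel lines quoted in this thread.
# This is an assumption about what is worth flagging, not a complete list.
FAULT_MARKERS = ("BUG:", "#PF:", "end trace")

# Matches lines such as: "Jul 23 04:45:43 Exposed kernel: <message>"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+kernel:\s+(.*)$"
)


def find_kernel_faults(lines):
    """Return (timestamp, message) pairs for kernel lines containing a marker."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and any(marker in match.group(3) for marker in FAULT_MARKERS):
            hits.append((match.group(1), match.group(3)))
    return hits


# Optional command-line use: python find_faults.py <path-to-syslog>
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for timestamp, message in find_kernel_faults(handle):
            print(timestamp, message)
```

Running it against a file such as the attached syslog-192.168.20.125.log would print one line per flagged kernel message, which makes it easier to match crash times against what the server was doing.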
July 24, 2024 Author
47 minutes ago, JorgeB said: Also make sure this has been taken care of: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173
Thanks for the reply! My RAM is running at the default speed, 2133 MHz I believe, and I haven't enabled SMT or any "game boost" option on the motherboard. Thanks for the link, though I'm not sure what I should do about my RAM frequency. I'll check the power setting next.
It seems the diagnostics zip I provided was yesterday's. Nothing has changed since then, but here is the up-to-date diagnostics file anyway. The syslog I attached was today's, saved by the syslog server on my other Unraid server.
The memtest finished without errors, by the way.
Attachment: exposed-diagnostics-20240724-1832.zip
Edited July 24, 2024 by A.sch3
July 25, 2024 Author
Just a heads-up: no crash since yesterday, which is encouraging I guess. Apart from the BIOS update, I haven't taken any other measures yet.
July 27, 2024 Author Solution
My system is still stable after the previous week of random crashes. I guess the BIOS update did the trick. Thanks @JorgeB for the additional information.
Edited July 27, 2024 by A.sch3
July 31, 2024 Author
On a side note, the power consumption is now much higher than it was before the BIOS update. I guess C-states are disabled by default in the new BIOS. I will try to change the BIOS setting, because I'm uneasy about seeing so much power going into this Unraid server, which had been pretty economical so far.
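Before changing anything in the BIOS, it may help to see which idle states the Linux kernel actually exposes. The sketch below reads the standard cpuidle entries under sysfs for one CPU. It only shows what the kernel reports, which reflects the BIOS setting indirectly, and the function name `list_idle_states` is hypothetical. Whether this explains the higher power draw here is an assumption, not something confirmed in this thread.

```python
from pathlib import Path


def list_idle_states(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return a list of dicts describing the idle states found in cpuidle_dir.

    Returns an empty list if the directory does not exist, for example
    when the kernel exposes no cpuidle driver.
    """
    base = Path(cpuidle_dir)
    if not base.is_dir():
        return []

    # Keep only directories named state<N> and sort them numerically.
    state_dirs = [p for p in base.glob("state*") if p.name[5:].isdigit()]
    state_dirs.sort(key=lambda p: int(p.name[5:]))

    states = []
    for state_dir in state_dirs:
        name_file = state_dir / "name"
        disable_file = state_dir / "disable"
        usage_file = state_dir / "usage"
        states.append({
            "index": int(state_dir.name[5:]),
            "name": name_file.read_text().strip() if name_file.exists() else "",
            # "disable" holds 0 when the state may be used by the kernel.
            "disabled": disable_file.exists()
            and disable_file.read_text().strip() != "0",
            # "usage" counts how many times the state has been entered.
            "usage": int(usage_file.read_text().strip())
            if usage_file.exists() else None,
        })
    return states


if __name__ == "__main__":
    for state in list_idle_states():
        print(state)
```

If only a polling state or C1 shows up, or the deeper states have a usage count that never increases, that would be consistent with deeper C-states being turned off, though it does not prove it.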