November 15, 2022

Hi everyone,

I'm asking for help because my Unraid system is unstable, and after 3 months of searching through the forum I haven't been able to identify the cause. Here is some context and the specs:

System:
M/B: ASRock B660M Steel Legend
CPU: 12th Gen Intel® Core™ i3-12100
RAM: 2x4GB Crucial DDR4 2133MHz
Array: 4x6TB Toshiba NAS HDD (1 parity + 3 disks)
Cache: WD 1TB NVMe SSD

The current OS version is 6.10.3.
Plugins: Community Applications, GPU Statistics, Intel GPU Top
Dockers: Plex

Issue: The system crashes after a random amount of time, in the following sequence:
1. The WebGUI is no longer accessible (login page not reachable).
2. If I notice it early enough, I can still use the keyboard on the NAS to run some commands, but powerdown has no effect, as it seems to loop indefinitely.
3. If I don't notice it early, the keyboard and screen connected to the NAS are frozen as well, and no command can be run locally.
4. In either case, a hard power-off/reset is necessary.

I've been following these users who had similar issues:

In September, my NAS was taking my whole network down, just like this user. I was on 6.11. I downgraded to 6.10 AND also connected the NAS through a 1 Gb port on my router instead of the 2.5G port. I don't know which change fixed what, but my network was fine after that. Unraid kept crashing, though.

I found this suggesting the RAM was faulty: I tested mine thoroughly (10 passes with MemTest) and found nothing.

Others found that Docker was the culprit, with a bad IP setting: I changed my Docker settings from macvlan to ipvlan. The system was stable for about 15 days, then crashed again.

During all this time, I had the syslog saved to the USB flash drive. Here is the full one (66 MB, sorry...): https://drive.google.com/file/d/1747Qm_1qJOaK1x9BwnpWFg7e-eiHGcSE/view?usp=share_link

I also thought this could be an issue with the mover, but it seems not, as the crashes occur at random times.
I also checked that all array disks are XFS.

In the syslog, when the system starts crashing, this loop happens every 3 minutes (starting at line 41361): error bloc.txt

I tried to troubleshoot with what I could find on "rcu_sched self-detected stall on CPU", but without success.

At some point, it seems that the system also runs a memory test, which fails every time (e.g. line 693 472): mem test fail.txt

I really don't know what causes these crashes and this loop. Hopefully I've been clear enough and you can help out. If you need any more info or details, please ask.

Thanks everyone!
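For anyone who wants to confirm the "every 3 minutes" pattern in their own syslog rather than by eye, here is a minimal Python sketch. It assumes classic syslog timestamps without a year (the year 2022 is filled in from the thread date), an English month abbreviation, and that each stall line contains the phrase quoted above. The function names and the sample lines are invented for illustration and are not copied from the attached log.

```python
import re
from datetime import datetime

# Phrase quoted in the post; real kernel lines may carry an extra prefix.
STALL_MARKER = "rcu_sched self-detected stall on CPU"

# Classic syslog prefix such as "Nov 16 15:22:38" (no year in the timestamp).
TIMESTAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2})")


def stall_times(lines, year=2022):
    """Return the timestamps of all lines that contain the stall marker."""
    times = []
    for line in lines:
        if STALL_MARKER not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            # Collapse the double space syslog uses for single-digit days.
            stamp = " ".join(match.group(1).split())
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Return the delay between each pair of consecutive timestamps."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


# Hypothetical sample lines, made up for this example.
sample = [
    "Nov 16 03:00:00 NAS kernel: rcu: INFO: rcu_sched self-detected stall on CPU",
    "Nov 16 03:01:10 NAS kernel: some unrelated message",
    "Nov 16 03:03:00 NAS kernel: rcu: INFO: rcu_sched self-detected stall on CPU",
]
print(gaps_in_seconds(stall_times(sample)))
```

To run it on a real file, pass the open file object instead of `sample`, for example `stall_times(open("syslog.txt", errors="replace"))`. Regular gaps would support the "loop" reading; irregular ones would suggest separate events.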
November 17, 2022 (Author)

As an update, it crashed again after one day of uptime. Here is the log (similar and shorter): syslog_15-16NOV.txt
November 17, 2022

I'm not great at this kind of thing, but your log seems to point to the server running out of RAM and shutting things down. I don't know what it is, but this process takes about 5.5 GB of your 8 GB of memory:

Nov 16 15:22:38 NAS kernel: filp 5727968KB 5727968KB

Your diagnostics might provide a bit more context.
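The arithmetic behind the "about 5.5 of your 8GB" estimate can be checked with a short Python sketch. It assumes the line follows a "name, used, total" layout with sizes in KB, which is how it reads here. That layout resembles the kernel's slab usage listing, where filp would be a kernel cache for open-file structures rather than a process, but the thread does not confirm this. The function names are made up for the example.

```python
import re

# Matches the trailing "<name> <used>KB <total>KB" part of the log line.
SLAB_ENTRY = re.compile(r"(\S+)\s+(\d+)KB\s+(\d+)KB\s*$")


def parse_slab_entry(line):
    """Return (name, used_kb, total_kb), or None if the line does not match."""
    match = SLAB_ENTRY.search(line)
    if match is None:
        return None
    name, used_kb, total_kb = match.groups()
    return name, int(used_kb), int(total_kb)


def kb_to_gib(kb):
    """Convert kibibytes to gibibytes (1 GiB = 1024 * 1024 KiB)."""
    return kb / (1024 * 1024)


# Log line quoted in the post above.
line = "Nov 16 15:22:38 NAS kernel: filp 5727968KB 5727968KB"
name, used_kb, total_kb = parse_slab_entry(line)
print(f"{name}: {kb_to_gib(used_kb):.2f} GiB used of {kb_to_gib(total_kb):.2f} GiB allocated")
```

On an 8 GB machine, a single entry of this size leaves little room for everything else, which is consistent with the out-of-memory reading of the log.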
November 17, 2022

@thibfighter Did you set up the Plex docker to transcode to RAM? If so, try disabling that first.
November 17, 2022 (Author)

1 hour ago, jfoxwu said:
@thibfighter Did you set up the Plex docker to transcode to RAM? If so, try disabling that first.

I'll try that. I also have one 16GB stick of ECC RAM on the way (twice my present capacity). My present RAM is also old 2133MT/s memory, so maybe it's not well suited to 12th Gen Intel.

Edited November 17, 2022 by thibfighter
November 23, 2022 (Author)

Some feedback: for the past week, the system has run flawlessly. It seems that transcoding to RAM was the culprit. I'll keep this thread open in case it crashes again.
November 27, 2022 (Author)

NEVERMIND 😛 It crashed again after 10 days.

"Funnily" enough, I now have new error lines saying my flash drive is blacklisted:

Nov 23 21:23:42 NAS emhttpd: Unregistered - flash device blacklisted (EBLACKLISTED2)
Nov 23 21:23:42 NAS kernel: traps: udevadm[11688] general protection fault ip:149a422552f4 sp:7fff93c82b70 error:0 in ld-2.33.so[149a42248000+25000]
Nov 23 21:23:43 NAS emhttpd: Basic key detected, GUID: 0781-5583-0001-200628116535 FILE: /boot/config/Basic.key

From what I found on the forum, this happened to some people after updating Unraid, and a Windows repair of the flash drive sometimes works. I did not do any kind of update, and Windows did not find any errors in my case...

In the end, I still get the "general protection fault" (line Nov 26 03:03:39 in the syslog), which locks up my system: I can still access the files on the NAS and use Plex, but shutting down from the local console is impossible. A hard shutdown was necessary.

Thank you for helping again!

syslog.txt
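The kernel "traps:" line above packs several fields into one string. As a rough aid for reading it, the Python sketch below pulls out the process name, the PID, and the offset of the faulting instruction inside the mapped library (instruction pointer minus the mapping's base address). The regular expression is written against the one line quoted in this post and is an assumption about the general layout; the function name `parse_trap` is invented for the example.

```python
import re

# Pattern modelled on the single "traps:" line quoted in the post.
TRAP = re.compile(
    r"traps: (?P<proc>\S+)\[(?P<pid>\d+)\] (?P<kind>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_trap(line):
    """Return a dict describing the fault, or None if the line does not match."""
    match = TRAP.search(line)
    if match is None:
        return None
    info = {
        "process": match.group("proc"),
        "pid": int(match.group("pid")),
        "kind": match.group("kind"),
        "module": match.group("module"),
        "offset": None,
    }
    if match.group("base") is not None:
        # Offset of the faulting instruction from the start of the mapping.
        info["offset"] = int(match.group("ip"), 16) - int(match.group("base"), 16)
    return info


# Log line quoted in the post above.
line = (
    "Nov 23 21:23:42 NAS kernel: traps: udevadm[11688] general protection fault "
    "ip:149a422552f4 sp:7fff93c82b70 error:0 in ld-2.33.so[149a42248000+25000]"
)
fault = parse_trap(line)
print(fault["process"], fault["pid"], fault["module"], hex(fault["offset"]))
```

Faults that show up in different processes and at unrelated offsets over time tend to point away from one buggy program and toward hardware such as RAM or the flash drive, although this sketch alone cannot tell those apart.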
November 30, 2022

Have you tried changing your USB flash drive? I had something similar before and found out my flash drive was dying.
December 4, 2022 (Author)

On 11/30/2022 at 3:15 PM, jfoxwu said:
Have you tried changing your USB flash drive? I had something similar before and found out my flash drive was dying.

I guess I'll try that if I get more of the same errors, but it would be quite a shame, as the drive has only been in use for 3 to 4 months...
January 10, 2023 (Author, Solution)

Hi all!

The problem is solved: in the end, the RAM I had originally installed was causing all the errors. The original RAM was an "old" 2x4GB 2133MHz kit. I guess its timings were incompatible with Intel 12th Gen CPUs rather than the modules being defective, because every memory test I ran came back without errors.

I ordered a new 16GB stick at 3200MHz, and the server has been running flawlessly for 1 continuous month. I never had this stability before!

Thanks again to all who guided my research!