vidkun Posted December 18, 2019 I have a new build with all new parts. Here's the list: https://pcpartpicker.com/list/3psBq3 The only difference is that the HDDs are shucked 12TB easystores instead of Reds.

I started on the last version of 6.7 (can't remember the third number) and was having issues where the entire system would lock up after many hours. Some days it'd only last a handful of hours, other times it'd go a couple of days. When it locks up, the webUI is unresponsive and won't load, any running containers or VMs are dead and unresponsive, all shares are unreachable, and all network traffic to the Unraid server appears dead as well (no SSH, no ping replies, etc.). The only way to recover is for me to hold the power button and do a hard reset. Once the server comes back up, everything appears just fine until it crashes again.

The syslogs don't appear to give any indication of something happening around the time of the crashes. I've uploaded diagnostics from right after the last two occurrences and a zipped syslog that's been archived to the flash through both of them. I've run Fix Common Problems, and it didn't find anything useful. I've tried letting it run with zero containers or VMs running and it still crashes.

I believe there were long periods of writes to the array (both with and without parity) before the crashes. Some were during an initial bulk load/transfer from another NAS, and others were during recent attempts at a Time Machine backup to a share, which has yet to complete because the server keeps crashing. I leave the webUI on the dashboard page so that when it crashes I can see the temps and CPU/RAM usage right before it dies. Temps are all in good ranges, RAM usage is minimal, and CPU usage is all over the place but never too high (max at 80%, but usually much lower).

I thought it might be a hardware support issue so I tried upgrading to 6.8, but that hasn't helped. Still getting the crashes. I've also checked that all cables are seated properly. Any help would be greatly appreciated!

syslog.zip odin-diagnostics-20191217-1556.zip odin-diagnostics-20191217-0513.zip
trurl Posted December 18, 2019 Have you done memtest (on the boot menu)?
vidkun Posted December 18, 2019 53 minutes ago, trurl said: "Have you done memtest (on the boot menu)?" I have not. For some reason I can't seem to get that to boot. When I choose it on the boot menu, it appears to just reboot the machine. The BIOS POST screen comes back up and it boots right back to the boot menu. I'm going to try creating another USB drive with a copy of memtest and see if that'll work. In the meantime, I've booted Unraid in safe mode to see if that helps any. Also, I had previously disabled CPU C-states in the BIOS based on another thread I came across with similar issues. Clearly that has not helped here.
Frank1940 Posted December 18, 2019 9 hours ago, vidkun said: "I have not. For some reason I can't seem to get that to boot. When I choose it on the boot menu, it appears to just reboot the machine. The BIOS POST screen comes back up and it boots right back to the boot menu. I'm going to work on trying to get another USB drive with a copy of memtest created and see if that'll work." In the back of my mind, there is a memory that says that memtest (at least, the version in Unraid) does not work if the server is booting using UEFI.
trurl Posted December 18, 2019 11 hours ago, vidkun said: "In the meantime, I've booted unraid in safe mode to see if that helps any." I see you have the Nerdpack plugin. Which packages from that did you install?
vidkun Posted December 19, 2019 Well, safe mode didn't save me. It lasted about a day in safe mode before locking up again. The memtest issue was UEFI related. I got that running and it completed 4 passes with 0 errors. As for Nerdpack, the only thing I installed that wasn't a dependency for one of the other plugins was screen. So that brings it to: screen, perl, utempter, bluez, git, and nodejs.
vidkun Posted January 1, 2020 Anyone have any other ideas what to check to figure out why this keeps locking up?
vidkun Posted January 2, 2020 Looks like there might be some compatibility issues between kernel 4.19 and the 9th gen Intel procs. I haven't found any specifics yet, but have seen a number of mentions of it throughout the forums. Since I'm currently on the 6.8 stable release, is it possible to use the USB creator to write the 6.8 rc7 image to my flash and boot back up to what I had? Or is that going to overwrite all my settings, etc.? Is there a way I can get my current instance downgraded to rc7 so I can test the 5.x kernel support? Thanks.