New build keeps freezing up requiring hard reset


vidkun

I have a new build with all new parts. Here's the list: https://pcpartpicker.com/list/3psBq3

Only thing different is the HDDs are shucked 12TB easystores instead of reds.

 

I started on the last version of 6.7 (can't remember the third number) and was having issues where the entire system would lock up after many hours. Some days it would only last a handful of hours; other times it would go a couple of days. When it locks up, the webUI is unresponsive and won't load, any running containers or VMs are dead, all shares are unreachable, and all network traffic to the unraid server appears dead as well (no SSH, no ping replies, etc).

 

The only way to recover is for me to hold the power button and do a hard reset. Once the server comes back up, everything appears just fine again until it crashes again. Syslogs don't appear to provide any indication of something happening around the time of the crashes. I've uploaded diagnostics from right after the last two occurrences and a zipped syslog that's been archived to the flash through both of them.
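For anyone digging through an archived syslog like the one attached, here is a rough sketch (not an official Unraid tool) that pulls out the last few lines written before each boot, which is roughly where a hard lockup would show up if it logged anything at all. It assumes each boot starts with a kernel line containing "Linux version"; the function name and the `context` parameter are made up for illustration.

```python
from collections import deque

# Assumption: the kernel logs a line containing this text at the start of each boot.
BOOT_MARKER = "Linux version"


def lines_before_each_boot(lines, context=5):
    """Return, for each boot marker, the `context` lines logged just before it.

    A boot marker with nothing logged before it (e.g. the very first line
    of the file) is skipped.
    """
    recent = deque(maxlen=context)
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if BOOT_MARKER in line:
            if recent:
                results.append(list(recent))
            recent.clear()
        else:
            recent.append(line)
    return results
```

To use it, open the extracted syslog and pass the file object in, e.g. `lines_before_each_boot(open("syslog", errors="replace"))`, then print each chunk. If the last lines before every reboot are just routine entries, that at least confirms the crash leaves no trace in the log.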

 

I've run Fix Common Problems with it not finding anything useful. I've tried letting it run with zero containers or VMs running and it still crashes. I believe that there were long periods of writes to the array (both with and without parity) before the crashes. Some were during an initial bulk load/transfer from another NAS and others were during the recent attempts at a Time Machine backup to a share, which has yet to complete since it keeps crashing.

 

I leave the webui on the dashboard page so when it crashes I can see the temps and CPU/RAM usage right before it dies. Temps are all in good ranges, RAM usage is minimal, and CPU usage is all over but never too high (max at 80%, but usually much lower).
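Watching the dashboard only works if someone is looking at the moment it dies. A minimal sketch of the same idea as a script: append the 1-minute load average and memory usage to a file every so often, so the last sample before a lockup survives the hard reset. This is only an illustration; it reads the standard Linux `/proc/meminfo`, does not capture temperatures, and the log path and interval are whatever you choose (frequent writes to the flash drive will add wear).

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def format_sample(timestamp, load1, meminfo):
    """Build one log line from a timestamp, 1-minute load, and parsed meminfo."""
    total = meminfo.get("MemTotal", 0)
    available = meminfo.get("MemAvailable", 0)
    used_pct = 100.0 * (total - available) / total if total else 0.0
    return f"{timestamp} load1={load1:.2f} mem_used={used_pct:.1f}%"


def log_forever(path, interval=30):
    """Append a sample to `path` every `interval` seconds (runs until killed)."""
    while True:
        with open("/proc/meminfo") as fh:
            mem = parse_meminfo(fh.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        line = format_sample(stamp, os.getloadavg()[0], mem)
        with open(path, "a") as out:
            out.write(line + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the sample to disk before the next sleep
        time.sleep(interval)
```

Calling `log_forever("/some/persistent/path/usage.log")` from a screen session would leave a trail to compare against the crash times.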

 

I thought it might be a hardware support issue so I tried upgrading to 6.8, but that hasn't helped. Still getting the crashes. I've also checked that all cables are seated properly.

 

Any help would be greatly appreciated!

syslog.zip odin-diagnostics-20191217-1556.zip odin-diagnostics-20191217-0513.zip

Link to comment
53 minutes ago, trurl said:

Have you done memtest (on the boot menu)?

I have not. For some reason I can't seem to get that to boot. When I choose it on the boot menu, it appears to just reboot the machine. The BIOS POST screen comes back up and it boots right back to the boot menu. I'm going to work on trying to get another USB drive with a copy of memtest created and see if that'll work.

 

In the meantime, I've booted unraid in safe mode to see if that helps any. 

 

Also, I had previously disabled CPU C states in BIOS based on another thread I had come across with similar issues. Clearly that has not helped at all here.

Link to comment
9 hours ago, vidkun said:

I have not. For some reason I can't seem to get that to boot. When I choose it on the boot menu, it appears to just reboot the machine. The BIOS POST screen comes back up and it boots right back to the boot menu. I'm going to work on trying to get another USB drive with a copy of memtest created and see if that'll work.

In the back of my mind, there is a memory that says that memtest (at least, the version in Unraid) does not work if the server is booting using UEFI.

Link to comment

Well, safe mode didn't save me. It lasted about a day in safe mode before locking up again.

 

Memtest issue was UEFI related. Got that running and completed 4 passes with 0 errors.

 

As for nerdpack, the only thing I installed that wasn't a dependency for one of the other plugins was screen. So that brings it to: screen, perl, utemper, bluez, git, and nodejs.

Link to comment
  • 2 weeks later...

Looks like there might be some compatibility issues between kernel 4.19 and the 9th gen Intel procs. I haven't found any specifics yet, but have seen a number of mentions of it throughout the forums.

 

Since I'm currently on the 6.8 stable release, is it possible to use the USB creator to write the 6.8 rc7 image to my flash and boot back up to what I had? Or is that going to overwrite all my settings, etc? Is there a way I can get my current instance downgraded to rc7 so I can test the 5.x kernel support?
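Whatever the right way to switch versions turns out to be, it seems prudent to copy the settings off the flash first. A small sketch of that, assuming the flash is mounted at `/boot` with the settings under `/boot/config` (the usual Unraid layout); the function name, archive name, and destination are made up for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_dir, dest_dir, stamp=None):
    """Zip everything under `config_dir` into `dest_dir` and return the archive path."""
    config_dir = Path(config_dir)
    if not config_dir.is_dir():
        raise FileNotFoundError(f"config directory not found: {config_dir}")
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" to the base name and returns the full archive path.
    archive = shutil.make_archive(
        str(dest / f"flash-config-{stamp}"), "zip", root_dir=str(config_dir)
    )
    return Path(archive)
```

For example, `backup_config("/boot/config", "/mnt/user/backups")` (the destination share here is hypothetical) would leave a timestamped zip that can be copied off the server before touching the flash.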

 

Thanks.
