
Random crashes, system is unstable, not sure where to start looking


Seth_J


I've had on-and-off success with Unraid since I started using it a few months back. Originally I used a very old HPE DL380 G7, but I have since moved to a quiet custom PC with modern parts. The DL380 was not very stable either, but I put that down to the used hardware and its age. It was also loud. Very loud.

 

Since moving to the new gear, Unraid crashes almost every night. I made no changes in the BIOS other than getting it to boot from the flash drive, and memtest86 passes every test except test 13, which (some) forum posts say doesn't matter since it's not ECC but I'm no expert. If that's it, I'll order more memory, I guess.

 

I set the syslog to be written to the flash drive, and when I woke up Unraid had crashed. I think there's also a diagnostics file that looks like it was saved around the same time? Not sure, but I'm attaching both. I can't make heads or tails of the syslog. The error looks like it happens at 8/31 0045.

 

Any help or direction to look would be greatly appreciated. 

bruno-diagnostics-20220829-2124.zip syslog.zip

Link to comment
4 hours ago, Seth_J said:

which (some) forum posts say doesn't matter since it's not ECC but I'm no expert.

Other way around, regular memtest won't find errors for ECC RAM.

 

Some strange crashing going on. Since it's an Alder Lake platform, update to v6.11.0-rc4; the newer kernel might help if it's a compatibility issue.
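For anyone who can't make sense of a syslog like this, one way to narrow it down is to pull out only the lines that look like kernel crash output and read around those. Here is a minimal sketch, assuming a plain-text syslog extracted from the zip. The marker list is an illustrative guess at common kernel messages, not an official or complete set, and `find_crash_lines` and `print_crash_lines` are hypothetical helper names.

```python
import re
from typing import Iterable, List, Sequence

# Substrings that commonly show up in Linux kernel crash output.
# ASSUMPTION: this list is illustrative only, not exhaustive or official.
CRASH_MARKERS = (
    "Call Trace",
    "BUG:",
    "Oops",
    "general protection fault",
    "Machine Check",
    "segfault",
    "kernel panic",
)


def find_crash_lines(
    lines: Iterable[str], markers: Sequence[str] = CRASH_MARKERS
) -> List[str]:
    """Return the lines containing any marker (case-insensitive match)."""
    pattern = re.compile("|".join(re.escape(m) for m in markers), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def print_crash_lines(path: str) -> None:
    """Print matching lines from a syslog file (path is supplied by the caller)."""
    # errors="replace" keeps going if the log contains stray non-UTF-8 bytes.
    with open(path, errors="replace") as fh:
        for hit in find_crash_lines(fh):
            print(hit)
```

Calling `print_crash_lines("syslog")` on the extracted file (the filename is an assumption) prints only the suspicious lines. The timestamps on those lines show where to start reading in the full log.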

Link to comment

I can only say this made things significantly worse. Docker would not run or install an app without faulting the system somehow. There's no way to reboot, only to power cycle the machine. I ended up running across all the i915 posts and checking some of the BIOS settings. I disabled the shared iGPU and turned off VT-d and C-states. The system locks up and diagnostics won't even run.

 

Rolled back to 6.10.3 but kept the BIOS settings. It crashed during the parity check. I'm watching the monitor output, helpless because I can't log in even locally or do anything. I'll pull the USB and grab the syslog and the last diagnostics I could get to run.

 

I give up. I ordered new memory of a different brand to see if that's the issue after all. Never had an issue with Crucial, but there's always a first time for everything.
