[SOLVED] Unraid 6.8.0 | Kernel Panics & System Freezes


vorel

Hi all,

 

I've been troubleshooting some random freezes that I cannot replicate and I would appreciate any other advice on what to do next.

 

Background

  • These random freezes started happening while I was on vacation (meaning that I didn't make any changes to the system when it started happening)
  • These freezes started on 6.7.2
  • I wasn't too concerned yet because I knew that 6.8.0 was around the corner and I was doing some major holiday hardware upgrades soon

 

What I upgraded since the crashes started happening

  • I upgraded from 6.7.2 → 6.8.0
  • I upgraded & replaced all of my data & cache drives
  • I upgraded my processor from AMD Ryzen 1600 to AMD Ryzen 2700X
  • I replaced my Unraid flash drive (just because the previous one was getting old)

 

What has stayed the same

 

Tests I have run (but still have random freezes)

  • Disabling VMs and Docker (through the options on Web GUI)
  • MemTest (passed -- see attached)
  • Upgraded BIOS (to the latest version ASRock recommends for my CPU)
  • Resetting all BIOS settings
  • Enabled/disabled C6 states (for AMD Ryzen processors)

 

What's strange

  • I cannot replicate the issue
  • This all started happening when I was not making any hardware or software changes
  • Sometimes there are no errors at the time of the freeze, other times I get a lot of messages
  • The RIP error values are not always the same (see my last syslog error and compare it to my actual attached console errors)

 

The latest error that I received in my syslog is:

kernel: RIP: 0010:get_page_from_freelist+0x252/0xd0b

Attached are my diagnostics and my syslog (I have been recording it to the flash drive ever since I upgraded my processor). I would greatly appreciate any other perspectives on what to try next. Thanks in advance!
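Since the RIP values are not always the same, one way to get an overview is to tally which kernel symbol each RIP line in the syslog points at. Here is a minimal Python sketch, assuming the log uses the same `RIP: 0010:symbol+0xoffset/0xsize` format as the line quoted above. The function names are just illustrative, and the path you pass in is whatever file your syslog is being saved to.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   kernel: RIP: 0010:get_page_from_freelist+0x252/0xd0b
# Assumption: every RIP line of interest follows this segment:symbol+offset/size shape.
RIP_RE = re.compile(
    r"RIP: [0-9a-fA-F]{4}:(?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def tally_rip_symbols(lines):
    """Count how many RIP lines name each kernel symbol."""
    counts = Counter()
    for line in lines:
        match = RIP_RE.search(line)
        if match:
            counts[match.group("symbol")] += 1
    return counts


def tally_rip_symbols_in_file(path):
    """Same tally, reading the log from a file (path is supplied by the caller)."""
    with open(path, errors="replace") as handle:
        return tally_rip_symbols(handle)


if __name__ == "__main__":
    example = ["kernel: RIP: 0010:get_page_from_freelist+0x252/0xd0b"]
    for symbol, count in tally_rip_symbols(example).most_common():
        print(f"{count:4d}  {symbol}")
```

If the symbols turn out to be scattered across unrelated parts of the kernel, that would be consistent with a hardware problem rather than a single misbehaving driver, although it does not prove it.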

 

ConsoleErrors-1.png

ConsoleErrors-2.png

ConsoleErrors-3.png

MemoryTest-Results.png

syslog

zeppelin-diagnostics-20191223-0842.zip

Edited by vorel
Solved the issue
Link to comment
  • 6 months later...

It ended up being a memory issue with my server.

 

I was running four DIMMs: two purchased in 2017 and two more purchased in 2018. The server ran fine for a year before the issues started.

 

The strange thing was that testing each stick individually with Memtest would pass, but with all four DIMMs installed the test would fail.

 

I contacted my motherboard manufacturer (ASRock). They said that even though I had the EXACT same part number for memory in all four slots, the problem was that I hadn't bought all of the memory together.

 

I ordered new memory and my server hasn’t crashed at all.

 

Update your BIOS and try different memory if you have it.

 

Good luck! Hope this helps you.

Link to comment
3 minutes ago, vorel said:

It ended up being a memory issue with my server.

 


BIOS is updated to the latest version and the memory was all bought at the same time: 4x16GB sticks. The longest I've had it run was about 6 hours before it hung completely and needed a reboot. I've done all the tweaks to the BIOS and config files but still can't get away from it. I'll try removing the sticks and see where that gets me.

Link to comment

You can try one stick at a time to see if that exposes any clues. What I learned was that when I tested one stick at a time, Memtest would pass.

 

It wasn't until I was running all four sticks that I had an issue.

 

I was very skeptical when ASRock told me to get new memory, but it hasn't crashed since my original post.
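Because one stick at a time passed and only all four together failed, it may be worth testing the combinations in between as well. Here is a rough Python sketch that just lists every group of sticks to run Memtest on, smallest groups first. The stick labels are made up for illustration, and which slots to populate for each group is something to check in your motherboard manual.

```python
from itertools import combinations


def dimm_test_plan(sticks, min_size=1):
    """Yield every group of sticks to test, smallest groups first."""
    for size in range(min_size, len(sticks) + 1):
        for group in combinations(sticks, size):
            yield group


if __name__ == "__main__":
    # Hypothetical labels: put a sticker on each physical stick to track it.
    sticks = ["stick1", "stick2", "stick3", "stick4"]
    # Single sticks already passed in my case, so start from pairs.
    for number, group in enumerate(dimm_test_plan(sticks, min_size=2), start=1):
        print(f"Run {number}: {' + '.join(group)}")
```

Noting which groups fail can hint at whether one particular pairing is the problem or whether the failure only shows up once all four slots are populated.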

Link to comment
1 minute ago, vorel said:

You can try one stick at a time to see if that exposes any clues. What I learned was when I tested one stick at a time, Memtest would pass.

 


 

Actually, I looked at my mobo's QVL and my RAM sticks aren't listed, so I went ahead and placed an order for specific sticks that are listed as working with it and AMD. Hoping that clears up the issue. Thanks for your input, I wouldn't have thought about that lol
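For anyone else comparing their sticks against a QVL, the installed part numbers can be read from the output of `dmidecode -t memory` (run as root, assuming dmidecode is available on your system). Below is a minimal Python sketch that parses that kind of text and lists the part number per slot. The sample text and its part numbers are placeholders I made up, not real output from any server in this thread, and the function names are only illustrative.

```python
def parse_memory_devices(dmidecode_text):
    """Parse `dmidecode -t memory` style text into one dict per Memory Device."""
    devices = []
    current = None
    for raw_line in dmidecode_text.splitlines():
        line = raw_line.strip()
        if line == "Memory Device":
            current = {}
            devices.append(current)
        elif not line:
            current = None  # a blank line ends the current record
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value.strip()
    return devices


def part_numbers_by_slot(devices):
    """Map each populated slot's Locator to its Part Number."""
    return {
        device.get("Locator", "?"): device.get("Part Number", "?")
        for device in devices
        if device.get("Size") != "No Module Installed"
    }


# Hypothetical, heavily trimmed sample; real output has many more fields.
SAMPLE = """\
Handle 0x0040, DMI type 17, 40 bytes
Memory Device
    Size: 16384 MB
    Locator: DIMM 0
    Part Number: EXAMPLE-PART-A

Handle 0x0041, DMI type 17, 40 bytes
Memory Device
    Size: No Module Installed
    Locator: DIMM 1
    Part Number: Unknown

Handle 0x0042, DMI type 17, 40 bytes
Memory Device
    Size: 16384 MB
    Locator: DIMM 2
    Part Number: EXAMPLE-PART-B
"""

if __name__ == "__main__":
    for slot, part in part_numbers_by_slot(parse_memory_devices(SAMPLE)).items():
        print(f"{slot}: {part}")
```

Keep in mind that matching part numbers did not rule out a problem in vorel's case, so this only helps with the QVL lookup, not with proving the sticks work together.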

Link to comment
