Kernel Call Trace (unRAID Crash)


cpthook

Hello.. 

 

Can someone help explain what this Call Trace means?  I woke up this morning to an unresponsive server (both GUI and TS), which forced me to do an unclean shutdown and a painful Parity Sync.  The sync came back clean, but this is the 3rd time in the last week my server has crashed.

 

Any help would be appreciated.  See the attachment for a screenshot of the Call Trace.

TowerI_Failure.PNG.628a36956c85431ef951558489eebeb9.PNG

Link to comment

These aren't easy to figure out, and there's little info evident there.  First thing I'd try is to check for a motherboard BIOS update.

 

Is this ECC RAM?  If not, do a Memtest, although I think you would be better off trying the newer PassMark Memtest.  It's less convenient, since you have to download it and create bootable media for it.

 

After that, you may want to try different kernels, go forward and go backward.

Link to comment

Thanks for the response RobJ. 

 

First thing I'd try is to check for a motherboard BIOS update.

 

Ok...  My MOBO is fairly new (SUPERMICRO MBD-X11SSL-F-O) but I will check for updates

 

Is this ECC RAM? 

 

But of course!  The moderators' and developers' recommendations insist on it; however, I'm only using one RAM slot on the MOBO, with this 16GB stick: Kingston 16GB DDR4 2133 ECC DIMM.  Would you still recommend a memtest?  Based on the trace, it looked to me like some type of RAM error, so I ordered a duplicate stick, which will be here today.

 

After that, you may want to try different kernels, go forward and go backward.

 

Ok you got me!!  At the risk of embarrassment here in this post, I'm assuming you mean try different unRAID versions? Roll back to previous or update to Beta version?  From what little I do know, I assumed the unRAID OS was using the latest Linux Kernel. If I'm way off please correct me!!!

Link to comment

First thing I'd try is to check for a motherboard BIOS update.

Ok...  My MOBO is fairly new (SUPERMICRO MBD-X11SSL-F-O) but I will check for updates

Yours is from late 2015, not terribly old but not new either, and motherboard BIOS updates are more common when the boards are new.  Once the boards have been out there for a while, manufacturers usually aren't interested in updating them; they would rather you buy their latest.

 

Is this ECC RAM? 

But of course!  The moderators' and developers' recommendations insist on it; however, I'm only using one RAM slot on the MOBO, with this 16GB stick: Kingston 16GB DDR4 2133 ECC DIMM.  Would you still recommend a memtest?  Based on the trace, it looked to me like some type of RAM error, so I ordered a duplicate stick, which will be here today.

I was too lazy to look the board up.  I don't think there's much point in Memtest; go with johnnie.black's advice instead.

 

After that, you may want to try different kernels, go forward and go backward.

Ok you got me!!  At the risk of embarrassment here in this post, I'm assuming you mean try different unRAID versions? Roll back to previous or update to Beta version?  From what little I do know, I assumed the unRAID OS was using the latest Linux Kernel.

As a NAS, where stability is vital, unRAID stays a little behind the latest kernel, but the beta versions are often close.  Right now, the best version to try is the latest, 6.3.0-rc6.  If that doesn't help, an older release sometimes works better on some systems; I would try 6.1.9.
Link to comment

You can check the event log in the BIOS or with IPMIView to see if there are any ECC errors.

 

Thanks for the tip...  See attachment.  I did recently install an NVIDIA GeForce GT 730 card in an attempt to migrate my desktop PC to KVM.  I like the idea that my main PC's specifications can be changed on the fly by virtualizing it.  I had a feeling the GPU might be causing some issues, but I also tried out a SUPERMICRO AOC-SAS2LP-MV8, so I wasn't quite sure.  I have no idea what is triggering these events though.  I'm hoping they're just related to shutdown(s) and reboot(s) of the VM or my box.

 

Yours is from late 2015, not terribly old but not new either, and motherboard BIOS updates are more common when the boards are new.  Once the boards have been out there for a while, manufacturers usually aren't interested in updating them; they would rather you buy their latest.

 

Ok..  which is probably my case.  According to the SuperMicro support site, I'm running the latest BIOS and IPMI versions.

 

Right now, the best version to try is the latest - 6.3.0-rc6

 

I will give it a shot..  right now I'm on the latest stable 6.2.4.

ipmi_eventlog.txt
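
For anyone following the suggestion to look for ECC errors in the event log, here is a minimal Python sketch that filters a text export such as ipmi_eventlog.txt for memory-related entries. The keywords (ECC, memory, DIMM) and the one-event-per-line layout are assumptions, since event wording differs between BMC vendors and firmware versions; the function names are hypothetical and this is not an official unRAID or Supermicro tool.

```python
import re

# Assumed keywords: real event-log wording varies by BMC vendor and firmware.
MEMORY_PATTERN = re.compile(r"\b(ecc|memory|dimm)\b", re.IGNORECASE)


def find_memory_events(lines):
    """Return the log lines that mention ECC, memory, or DIMM (case-insensitive)."""
    return [line.rstrip("\n") for line in lines if MEMORY_PATTERN.search(line)]


def scan_file(path):
    """Read a text export of the event log and return its memory-related lines."""
    # errors="replace" keeps the scan going if the export has odd encoding bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_memory_events(handle)
```

Calling `scan_file("ipmi_eventlog.txt")` returns the matching lines as a list. An empty list only means none of the assumed keywords appeared, not that the memory is healthy.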

Link to comment

Archived

This topic is now archived and is closed to further replies.
