kernel panic


bonzi
Solved by JorgeB


3 hours ago, only-university6482 said:

Did you get to the bottom of the kernel panic error? I'm having the same issue on mine, so I turned on syslog yesterday.
I see yours passed the memtest, so is the issue somewhere else?
Thanks

I am waiting for it to crash again; I think I will need that data to understand why this is happening. The memtest was fine: it ran overnight with no errors.

15 hours ago, bonzi said:

I am waiting for it to crash again; I think I will need that data to understand why this is happening. The memtest was fine: it ran overnight with no errors.

Mine just crashed again 10 minutes ago! I'm at a loss; I will run a memory test.
I had the syslog server enabled and just looked at the file: there is nothing in it between 10:28:57 and the restart. It crashed just before the 11:05 timestamp.

Oct 20 10:28:56 tower500 proxmox-backup-proxy[57]: starting rrd data sync
Oct 20 10:28:57 tower500 proxmox-backup-proxy[57]: rrd journal successfully committed (18 files in 0.050 seconds)
Oct 20 11:05:31 tower500 proxmox-backup-api[9]: service is ready
Oct 20 11:05:31 tower500 proxmox-backup-proxy[10]: service is ready
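
When the log is silent before a crash, the gap itself at least narrows down when the machine went away. A minimal sketch of finding such gaps, assuming classic syslog timestamps like the ones above; the function name `find_gaps`, the 15-minute threshold, and the year are my own assumptions (syslog lines carry no year), and a quiet but healthy system can also produce gaps:

```python
from datetime import datetime, timedelta

def find_gaps(lines, min_gap=timedelta(minutes=15), year=2023):
    """Return (last_seen, next_seen) pairs where nothing was logged for min_gap or longer."""
    gaps = []
    prev = None
    for line in lines:
        # Classic syslog lines begin with e.g. "Oct 20 10:28:56" (15 characters, no year).
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev is not None and stamp - prev >= min_gap:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps

# The four lines quoted above, used as sample input.
sample = [
    "Oct 20 10:28:56 tower500 proxmox-backup-proxy[57]: starting rrd data sync",
    "Oct 20 10:28:57 tower500 proxmox-backup-proxy[57]: rrd journal successfully committed (18 files in 0.050 seconds)",
    "Oct 20 11:05:31 tower500 proxmox-backup-api[9]: service is ready",
    "Oct 20 11:05:31 tower500 proxmox-backup-proxy[10]: service is ready",
]

gaps = find_gaps(sample)
for last_seen, next_seen in gaps:
    print(f"silent from {last_seen} to {next_seen} ({next_seen - last_seen})")
```

The last line before each gap is the latest point at which the system was known to be alive, so that is where to look for anything unusual.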

On 10/26/2023 at 6:03 PM, bonzi said:

Ok, I disabled C-states in the BIOS. I have done an overnight memtest, which passed without any issues. Let's see if disabling C-states fixes the problem.

I had another crash last night, with C-states disabled. Frustratingly, there seems to be little of interest in the syslog. The only activity I see is this, which I believe is my Coral TPU:

Oct 28 23:11:22 Tower kernel: usb 2-2: reset SuperSpeed USB device number 2 using xhci_hcd
Oct 28 23:11:22 Tower kernel: usb 2-2: LPM exit latency is zeroed, disabling LPM.
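
To see whether resets like these cluster around the crashes, they can be counted per USB port across the whole syslog. A rough sketch, assuming the kernel message format shown above; `count_usb_resets` is a hypothetical helper name, and a reset on its own does not prove the device or port is at fault:

```python
import re
from collections import Counter

# Matches kernel lines such as "kernel: usb 2-2: reset SuperSpeed USB device ..."
RESET = re.compile(r"kernel: usb (\S+): reset ")

def count_usb_resets(lines):
    """Count USB reset messages per bus-port identifier (e.g. "2-2")."""
    counts = Counter()
    for line in lines:
        match = RESET.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The two lines quoted above, used as sample input.
sample = [
    "Oct 28 23:11:22 Tower kernel: usb 2-2: reset SuperSpeed USB device number 2 using xhci_hcd",
    "Oct 28 23:11:22 Tower kernel: usb 2-2: LPM exit latency is zeroed, disabling LPM.",
]

resets = count_usb_resets(sample)
print(resets)
```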

At this point I really do not know what to do, except hope to get more information before the crash.

EDIT: I have also posted a screenshot of what was printed on the screen at the time of the crash.

Screenshot 2023-10-29 at 10.30.36 AM.png


It crashed again, and this time I think we might have something more useful. Could the USB ports be causing this, either because they are failing or because they are not supplying enough power to devices that draw a lot? I am thinking especially of the USB port that the Coral TPU is plugged into. Thoughts? That is what it looks like to me from the log.

I attached the full log from the crash and reboot. @JorgeB, could you have a look at the first few lines and let me know if you think this is what is going on? Thanks!

syslog-crash-2.rtf

On 10/29/2023 at 3:32 PM, JorgeB said:

A lot of call traces; they look more hardware related. Try with just one RAM stick; if it's the same, try a different one. That will basically rule out RAM.

Ok, I will give that a try. Crashes are becoming more frequent now, often after a few hours and within less than a day. Memtest does pass, but it does seem like a hardware issue.


I have been running for 4 days without a crash with a single stick of memory. I am going to let it go for at least a few more days, then I will put the other sticks back in to see which one is causing problems.

It's strange that memtest did not pick this up at all, but I think I will be able to figure this out now.

