
[SOLVED] Random Freezes... but was able to Get Diagnostics This Time


denzo
Solved by JorgeB


I have been experiencing random lockups/freezes for several months. I am always trying out new Docker containers, so I assumed the freezes were related to those, but for the past couple of months I have not installed or run anything other than the containers I "need", and I still get random lockups. Tonight I was able to access the GUI (even though some containers and other things had stopped working) and download diagnostics (a syslog captured after rebooting is also attached). I don't know what I am looking at in these logs, but I see lots of errors. I am hoping a knowledgeable member here can take a look and give me a quick roadmap of what to check first, second, and third to get things stable. Any help will be much appreciated.

nas-diagnostics-20220828-2210.zip

nas-syslog-20220829-0410.zip

Edited by denzo
added syslog
11 hours ago, JorgeB said:
BTRFS error (device nvme0n1p1): block=2530556395520 write time tree block corruption detected

This is usually a sign of bad RAM. Some data corruption was also found earlier, so start by running memtest.
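For anyone who wants to check their own syslog for lines like the one quoted above, here is a minimal Python sketch. It assumes the kernel messages follow the `BTRFS error (device ...): ...` shape shown in the quote; the function name `summarize_btrfs_errors` and the sample lines are made up for illustration and are not taken from the attached diagnostics.

```python
import re
from collections import Counter

# Assumed line shape, based on the message quoted in this thread:
#   BTRFS error (device nvme0n1p1): block=... write time tree block corruption detected
# The pattern also tolerates extra text after the device name inside the parentheses.
BTRFS_LINE = re.compile(r"BTRFS (error|critical|warning) \(device ([^):\s]+)[^)]*\): (.*)")


def summarize_btrfs_errors(lines):
    """Count BTRFS error/critical/warning lines per (device, level)."""
    counts = Counter()
    for line in lines:
        match = BTRFS_LINE.search(line)
        if match:
            level, device, _message = match.groups()
            counts[(device, level)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; the timestamp and host prefix are illustrative only.
    sample = [
        "Jan  1 00:00:00 host kernel: BTRFS error (device nvme0n1p1): "
        "block=2530556395520 write time tree block corruption detected",
        "Jan  1 00:00:01 host kernel: an unrelated message",
    ]
    for (device, level), count in summarize_btrfs_errors(sample).items():
        print(f"{device}: {count} line(s) at level '{level}'")
```

To run it against a real log, pass an open file instead of the sample list, e.g. `summarize_btrfs_errors(open("syslog.txt", errors="replace"))`, where `syslog.txt` is a placeholder path. A non-zero count only says the filesystem reported problems; it does not say why, which is where memtest comes in.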

Thanks for the reply. I am running Memtest and getting plenty of errors (see attached images). Pardon my ignorance, but are these types of errors solely due to the RAM itself, or could they be motherboard related? Should I just replace this memory? It looks like more than one stick has errors; does memory "go bad" like this in multiple sticks at the same time? What might cause multiple (or single) failures like this? In other words, what should I do to fix this and, hopefully, keep it from happening again in another 2 years?

IMG_20220829_105948_562.jpg

IMG_20220829_141935_484.jpg


RAM, the CPU, and the motherboard can all fail, so a memtest failure does not pinpoint the failing item. My guess is that RAM is the most frequent culprit, but that guess is not based on any hard evidence.

It can sometimes be worth simply reseating the RAM in its motherboard slots in case it has worked slightly loose.

It is possible for each RAM stick to test fine individually and still produce failures when multiple sticks are plugged in, because the memory controller is overloaded. Carefully check your motherboard manual for the maximum RAM speed your motherboard+CPU combination can support while remaining stable. It is often lower than the rated speed of the RAM sticks, and it can vary with the number of sticks you have plugged in.

Anything other than 0 failures means the system will be unreliable. As for how long to run the test, the general answer is at least one complete pass, and ideally longer (e.g. overnight) as long as you are getting 0 errors. There is no point in continuing a test once errors are reported, other than perhaps to see whether they point to a particular RAM stick or slot.


UPDATE 1: I started testing my memory one stick at a time (in slot 1). Two sticks passed (zero errors after 1+ hour), and with either of the other two sticks (tested separately and repeatedly) my system would not POST. So I am going to run my system with the two "good" sticks and see how stable everything is. I'll update this thread either way. Thanks for the help so far!
UPDATE 2: It's been over a week and everything has been stable. Once again, thanks for the help figuring this out!

Edited by denzo
  • denzo changed the title to [SOLVED] Random Freezes... but was able to Get Diagnostics This Time
