Nomar1245 Posted August 10, 2023 Over the last few weeks I've been having increasing problems with my server. First it would lock up entirely and require a hard shutdown and reboot. That happened a few times, and initial testing led me to believe the memory was starting to fail. I replaced it, but then my docker containers started to freeze. I mostly use Emby and was starting to see read-only permission errors, so I deleted and recreated my docker image. The problem persists though, and I have no idea why. The only pattern I think I'm noticing is that it freezes while a parity check is running, which I know shouldn't be a real problem, as I've run it on the monthly schedule for years without any interruption other than mild performance issues. I was in the process of preparing it for a move into a new system, but can't seem to nail down the problem. My diagnostic logs are attached (stark-diagnostics-20230810-1645.zip), and my system is currently completing a parity check again. So far 0 errors at ~45%, with another 12 hours or so to go. Any help would be appreciated. I'm very tempted to reset the entire environment and start from scratch, but I'd really prefer to avoid that.
Squid Posted August 10, 2023
Aug 10 16:29:44 Stark kernel: BTRFS critical (device sdd1): corrupt leaf: block=5451645927424 slot=57 extent bytenr=5341179084800 len=16384 unknown inline ref type: 255
First thing to do is run Memtest from the boot menu for at least a couple of passes, as corruption is usually caused by bad memory. If you're currently booting via UEFI, you will have to temporarily switch to legacy boot or set up a new flash drive from https://www.memtest86.com/
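For anyone wanting to check their own syslog for the same kind of message, here is a minimal Python sketch. It is only an illustration: the regular expression is modeled on the single log line quoted above, the function name `find_btrfs_critical` is made up for this example, and other kernel versions may word these messages differently.

```python
import re

# Assumption: the pattern mirrors the one log line quoted in this thread.
# Other kernels or log formats may need a different expression.
BTRFS_CRITICAL = re.compile(
    r"kernel: BTRFS critical \(device (?P<device>[^)]+)\): (?P<message>.*)$"
)


def find_btrfs_critical(lines):
    """Return (device, message) tuples for each BTRFS critical log line."""
    hits = []
    for line in lines:
        match = BTRFS_CRITICAL.search(line)
        if match:
            hits.append((match.group("device"), match.group("message").strip()))
    return hits


if __name__ == "__main__":
    # The sample is the line from the post above; in practice you would
    # read the lines from the syslog file inside the diagnostics zip.
    sample = [
        "Aug 10 16:29:44 Stark kernel: BTRFS critical (device sdd1): "
        "corrupt leaf: block=5451645927424 slot=57 "
        "extent bytenr=5341179084800 len=16384 unknown inline ref type: 255",
    ]
    for device, message in find_btrfs_critical(sample):
        print(device, "->", message)
```

A hit only tells you which device reported corruption. It does not say whether the cause is RAM, the controller, or the drive, which is why the memtest comes first.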
Nomar1245 Posted August 10, 2023 Thanks. It makes me feel a bit better to find out I was on the right path. At this point it seems the board or CPU is failing. The memory that is installed is less than a week old.
JonathanM Posted August 10, 2023 33 minutes ago, Nomar1245 said: "The memory that is installed is less than a week old." New doesn't mean good. The first thing you need to do with new memory is run a memtest.
Nomar1245 Posted August 10, 2023 While I understand what you're saying, the odds of the exact same problem happening before and after replacing the RAM have to be astronomical.
JonathanM Posted August 10, 2023 2 minutes ago, Nomar1245 said: "While I understand what you're saying, the odds of the exact same problem happening before and after replacing the RAM have to be astronomical." Memtest also checks the memory controller and the data path, so it would still be a good test. If it fails, you have a smoking gun to investigate. Even if all your RAM sticks turn out to be good, memtest could still fail: memory timing or voltage in the BIOS could be wrong, or just not stable at the current values. If memtest passes 24 hours with no errors, then you have another data point to help you diagnose things.
Nomar1245 Posted August 10, 2023 Well, there we are on the same page. I’ve been running the test for a bit more than an hour now.