Iceman24 Posted August 25, 2022 (edited) Attached are diagnostics and the syslog, which I mirrored to flash so that I would have more data if it happened again. The lockup (as I'll call it) occurred between 18:14:30 and 18:16:50. I know this because I have a monitor checking a Docker app, and that app goes down like everything else when the lockup happens. Once it happens I cannot even ping the server, yet it kept logging for the 15 minutes or so until I hard rebooted it. It has happened about 5 times over 2 weeks. It had been a few days since the last one, so I thought it might have been resolved. I reseated the memory and ran MemTest 9, which reported PASS after its 4 passes.
Originally, unless this is unrelated, the problem started with something happening overnight: I couldn't access anything, and after a reboot I couldn't see the files on the cache drive. After an XFS repair I could, and everything was okay. About 2 days later the regular lockups began, at very random intervals (maybe 2 days, 24 hours, or 5 hours apart). I usually wasn't using the server directly when it happened, except once.
Nothing stood out to me in the syslog around the lockup time, but it isn't a log I am used to reading. I think I once saw the memory log show 100% full on the dashboard while the server still kind of worked. I tried to reboot, it wouldn't, and I had to hard reboot. I didn't see it seem to fill up quickly when monitoring it. At that time CompreFace was pegging the CPU, but even with it kept off, the server still locked up once.
I'd appreciate any help, please. This is driving me mad. I rely on this server so much. Thanks. Just over 2.5 years old. No issues up until this. Edited September 4, 2022 by Iceman24
JorgeB Posted August 26, 2022 The log shows many NMI events before the crash, and before those events there are some strange WSD errors. WSD is known to sometimes cause high CPU usage, so I would start by disabling it: Settings -> SMB Settings
Iceman24 Posted August 26, 2022 (edited) Thanks, JorgeB. I will try that now. I did notice the CompreFace container giving me issues again, this time refusing to stop. It was holding up almost everything else, so I had to stop all the other containers individually. I couldn't even kill the process and had to hard reboot again. Ugh. I've had this container for months with no recent update and, as I said before, the problem also happens without it running. Edit: I have had the option "-i br0" set on WSD this whole time. I don't recall why; I read about it years ago. Edited August 26, 2022 by Iceman24
Iceman24 Posted August 26, 2022 Another thing is that every time now I have to turn off the Docker service, go into the network settings, and edit the DNS entries just enough that, after putting them back, I can click Apply, so nothing actually changes. Then I restart the Docker service and have proper DNS connectivity the way I have things set up. The issues I see vary, but sometimes they go as far as delays pinging anything on the Internet at all; everything is fine once I do what I just described.
Iceman24 Posted August 29, 2022 (edited) It happened again just now. More diagnostics and syslog attached. The lockup occurred between 20:11:30 and 20:13:15. I did notice the "rcu_sched self-detected stall on CPU" error and found some other information on it, but I will leave it to someone more knowledgeable to advise me on how to move forward with my particular server. Thanks. I could update from 6.9.2, but I have been hesitant due to some issues people had. Edited September 4, 2022 by Iceman24
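For anyone reading a mirrored syslog around a known lockup window, here is a minimal Python sketch of how the relevant lines could be pulled out. It assumes the "Mon DD HH:MM:SS host ..." line layout seen in the log excerpts in this thread. The function name `lines_in_window` and the sample lines are made up for illustration and are not taken from the attached diagnostics.

```python
from datetime import datetime

# Message text as quoted in the post above; treated here as a plain substring.
STALL_TEXT = "rcu_sched self-detected stall on CPU"


def lines_in_window(lines, day, start, end):
    """Return lines stamped on `day` (e.g. "Aug 29") between start and end.

    `start` and `end` are "HH:MM:SS" strings and both ends are inclusive.
    Lines that do not begin with a "Mon DD HH:MM:SS" stamp are skipped.
    """
    lo = datetime.strptime(start, "%H:%M:%S").time()
    hi = datetime.strptime(end, "%H:%M:%S").time()
    hits = []
    for line in lines:
        parts = line.split(None, 3)  # month, day, time, rest of the line
        if len(parts) < 3:
            continue
        if f"{parts[0]} {parts[1]}" != day:
            continue
        try:
            stamp = datetime.strptime(parts[2], "%H:%M:%S").time()
        except ValueError:
            continue  # third field is not a clock time
        if lo <= stamp <= hi:
            hits.append(line)
    return hits


# Hypothetical sample lines, made up for this example only.
sample = [
    "Aug 29 20:10:59 BlackIce kernel: example line before the window",
    "Aug 29 20:12:01 BlackIce kernel: rcu_sched self-detected stall on CPU",
    "Aug 29 20:14:00 BlackIce kernel: example line after the window",
]

window = lines_in_window(sample, "Aug 29", "20:11:30", "20:13:15")
stalls = [line for line in window if STALL_TEXT in line]
for line in window:
    print(line)
```

To use it on a real log, pass the lines of the mirrored syslog file instead of `sample`. Widening the window by a few minutes on each side may help, since the trigger can be logged before the monitor notices the outage.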
Iceman24 Posted August 29, 2022 And it crashed again already while using Plex...🤬
Solution JorgeB Posted August 29, 2022
Aug 26 12:18:49 BlackIce kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
Aug 26 12:18:49 BlackIce kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]
Macvlan call traces are usually the result of having Docker containers with a custom IP address, and they will end up crashing the server. Upgrading to v6.10 and switching to ipvlan should fix it: Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right).
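As a rough way to check a syslog for the same signature, the Python sketch below counts call-trace frames that belong to the macvlan module. The pattern is modeled only on the two log lines quoted above, plus the "? " prefix the kernel puts on unreliable frames. The name `macvlan_frames` is hypothetical, and a match suggests this issue but is not proof of it.

```python
import re
from collections import Counter

# Matches frames such as:
#   kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
# and the same frame with a leading "? " marker.
MACVLAN_FRAME = re.compile(
    r"kernel:\s+(?:\? )?(macvlan_\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[macvlan\]"
)


def macvlan_frames(lines):
    """Count macvlan call-trace frames by function name."""
    counts = Counter()
    for line in lines:
        match = MACVLAN_FRAME.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The two lines quoted in the post above.
sample = [
    "Aug 26 12:18:49 BlackIce kernel: macvlan_broadcast+0x10e/0x13c [macvlan]",
    "Aug 26 12:18:49 BlackIce kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]",
]

counts = macvlan_frames(sample)
for name, seen in counts.items():
    print(name, seen)
```

An empty result after switching to ipvlan would be consistent with the fix having worked, although the absence of lockups over time is the more meaningful signal.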
Iceman24 Posted August 29, 2022 Thanks, done, will wait to see what happens.
Iceman24 Posted September 4, 2022 It has been good; the issue appears resolved. Thanks again. Glad to be over this.