flurec Posted July 3
tower-diagnostics-20240703-0910.zip
I'm starting a new topic because my network is also going down, and I didn't see that mentioned in the other thread. I have been having daily, at times twice-daily, unclean shutdowns at completely random times of the day. At the same time, my entire home network goes down. I run Home Assistant and the Ubiquiti UniFi controller in Docker, and I can see that the network goes down for around 10 minutes. The longest it has gone without an unclean shutdown is 3 days, but normally it happens daily.
Memtest is fine. VMs are turned off. The Docker network type is macvlan, but I followed the alternative setup instructions for using it. One thing I have done is reserve the IP addresses for my Docker containers in OPNsense by MAC address, just to keep track of IPs on the network. I don't know if that is best practice. The server is plugged into a UPS.
JorgeB Posted July 3
5 minutes ago, flurec said: I have been having daily, at times twice daily, unclean shutdowns.
Do you mean the server reboots by itself? If so, that is almost always a hardware issue.
flurec Posted July 3
This might be a dumb question: is there a way for me to know if the server is actually rebooting? Whatever is happening is happening on its own and at random. I guess my question is this: if I get an unclean shutdown without having taken any prior action, does that mean the server rebooted on its own, or can an unclean shutdown be triggered without an actual reboot?
What would be the order for replacing equipment? Memtest checked out, so PSU, Ethernet card, CPU, then motherboard? Would that be a good approach? I built this server myself, and the CPU, HBA controller, and motherboard were purchased used.
JorgeB Posted July 3
1 hour ago, flurec said: This might be a dumb question- Is there a way for me to know if the server is actually rebooting?
Check the uptime:
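For anyone who would rather script the uptime check, here is a minimal sketch, assuming a Linux-based server (Unraid is one) where `/proc/uptime` is readable. The function names `parse_uptime`, `boot_time`, and `format_uptime` are made up for illustration and are not part of Unraid. If the reported boot time lines up with the moment the network dropped, the server really did restart.

```python
from datetime import datetime, timedelta


def parse_uptime(proc_uptime_text: str) -> float:
    """Return seconds since boot from the contents of /proc/uptime."""
    # The first field is seconds since boot; the second is cumulative idle time.
    return float(proc_uptime_text.split()[0])


def boot_time(now: datetime, uptime_seconds: float) -> datetime:
    """Work backwards from the current time to the moment of the last boot."""
    return now - timedelta(seconds=uptime_seconds)


def format_uptime(uptime_seconds: float) -> str:
    """Render an uptime in seconds as days, hours, and minutes."""
    days, rem = divmod(int(uptime_seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return f"{days}d {hours}h {minutes}m"


if __name__ == "__main__":
    try:
        with open("/proc/uptime") as f:
            seconds = parse_uptime(f.read())
    except OSError:
        # Assumption: this only happens on hosts without a Linux /proc.
        print("/proc/uptime is not available on this host")
    else:
        print("Up for", format_uptime(seconds))
        print("Booted at", boot_time(datetime.now(), seconds).strftime("%Y-%m-%d %H:%M"))
```

An uptime of a few hours on a server that nobody restarted means it rebooted or lost power on its own, which is what the unclean shutdown is reporting.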
flurec Posted July 3
Thanks. I looked at the uptime, and it only goes back to this morning, when I had the last unclean shutdown. I will replace some hardware. Is there anything I can look for in the logs that might point me in the right direction as far as a hardware problem goes?
JorgeB Posted July 3
Usually there's nothing logged with this issue. I do see btrfs detecting data corruption:
Jul 3 06:22:12 Tower kernel: BTRFS info (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 1467, gen 0
This is usually RAM related, and memtest is only definitive if it finds errors. If you have multiple sticks, try running the server with just one; if the problem persists, try again with a different one. That will basically rule out bad RAM.
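To look for the same kind of line in your own syslog, a small parser can pull out the per-device error counters. This is only a sketch: the regular expression is written against the one log line quoted above, so treat the exact format as an assumption, and the names `parse_btrfs_errs` and `nonzero_counters` are hypothetical helpers, not part of any btrfs tool.

```python
import re

# Pattern modelled on the single syslog line quoted in this thread (assumed format).
BTRFS_ERRS = re.compile(
    r"bdev (?P<bdev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a btrfs 'errs:' line, or None if it does not match."""
    match = BTRFS_ERRS.search(line)
    if match is None:
        return None
    counters = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "bdev"
    }
    return match.group("bdev"), counters


def nonzero_counters(counters):
    """Keep only the counters that have recorded at least one error."""
    return {name: value for name, value in counters.items() if value}


if __name__ == "__main__":
    sample = (
        "Jul 3 06:22:12 Tower kernel: BTRFS info (device sdk1): "
        "bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 1467, gen 0"
    )
    device, counters = parse_btrfs_errs(sample)
    print(device, nonzero_counters(counters))
```

In the quoted line only the `corrupt` counter is nonzero, while the write, read, and flush counters are all zero. That is the pattern behind the suggestion to look at RAM rather than the drive.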
flurec Posted July 3
I have two sticks so I will try each one. Thanks!
flurec Posted July 5
I ran the server on each stick by itself, and I had unclean shutdowns with both after 3-5 hours. I've attached the most recent diagnostics log. Anything new here? I guess there is a chance both sticks are bad, but that seems unlikely.
tower-diagnostics-20240704-2016.zip
JorgeB Posted July 5
Most likely it's not the RAM. The PSU, board, or CPU would be the next suspects.
Solution flurec Posted July 17
So I feel really dumb. I have the Unraid server and my Ubiquiti AP plugged into the same UPS. That UPS does not indicate that the battery needs replacing, and it appears to be functioning normally. Well, it looks like this UPS was resetting itself almost daily, killing power to the server and the AP and causing those unclean shutdowns. Moral of the story: bypass your UPS first. Thanks for your help, Jorge.