asnt Posted March 28, 2023 (edited)
My server crashed several times last month. At first I didn't pay much attention, since the server is usually stable and I had never had any problems with it. But since it kept crashing, I started looking into what could be the reason. Today I had another crash and decided to look at the logs. I saw a lot of errors that, unfortunately, I don't understand. Can someone help me?
The crash today resulted in an unclean shutdown, and I am currently in the middle of a parity check. I tested all my drives and the SMART data came back OK. I have attached the diagnostics file. The errors I mentioned appear around the 18:29:18 timestamp in the syslog file. Thank you.
m93p-diagnostics-20230327-1842.zip
Edited April 11, 2023 by asnt: Marking as solved
JorgeB Posted March 28, 2023
Enable the syslog server and post that after a crash.
hunter69 Posted March 29, 2023
If this is any help, I have the Plex docker. I have been having weird reboot crashes. I traced them to the Plex docker. I have the linuxserver.io version. I am in the process of figuring out what to do next. I believe it was an update to the docker that broke it.
asnt Posted March 30, 2023 (Author)
On 3/28/2023 at 1:07 AM, JorgeB said: Enable the syslog server and post that after a crash.
Thank you! I just enabled it (unfortunately not fast enough, since I had another crash before reading your reply).
On 3/29/2023 at 7:09 AM, hunter69 said: If this is any help, I have the Plex docker. I have been having weird reboot crashes. I traced them to the Plex docker. I have the linuxserver.io version. I am in the process of figuring out what to do next. I believe it was an update to the docker that broke it.
Thanks, that's good to know. I have the official Plex version. I'll keep it running until I have another crash, to see if I can get it logged as per JorgeB's suggestion. I might stop Plex afterwards and see if the server is stable again. I am getting weird messages from Unraid that my docker size is over 70%, and then it returns to normal. I used to get these messages when I was updating a docker, but never when the server was idle.
hunter69 Posted March 31, 2023
Well, I know it is the Plex docker. I have done a lot of uninstalling and installing of other Plex dockers. I am not making any progress resolving the crash. If the Plex docker is enabled, it crashes during the default maintenance window, 3-5am. I installed Plex on another computer and had it scan the same movie file, and it did not crash. So I think that eliminates corrupt files. I am stumped on the next step. I enabled the syslog server, but I do not see any logs in the share I told it to use. Have you made any progress?
hunter69 Posted March 31, 2023
Actually, I found the syslog. Can someone take a look and see if there is a problem?
syslog-192.168.2.97.log
hunter69 Posted April 2, 2023
Now my server is just crashing. I managed to grab a log. Would someone look at the bottom of the log and see if you can tell what is going on?
syslog-192.168.2.97.log
JorgeB Posted April 2, 2023
There's nothing relevant logged, which usually points to a hardware problem. One thing you can try is to boot the server in safe mode with all Docker/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
hunter69 Posted April 2, 2023
I have done some pretty drastic troubleshooting. I changed the power supply, no change. I moved the drives to the onboard SATA controller, no change. I rebuilt a new install of Unraid with a backup USB, no change. I downloaded the diagnostics. The behavior did change recently: basically, I lose ping and cannot access the server.
tower-diagnostics-20230402-1109.zip
hunter69 Posted April 2, 2023
I guess if this is a hardware issue, then it's either the RAM or the motherboard.
hunter69 Posted April 2, 2023 (edited)
I changed the RAM, no difference. I removed unnecessary drives, no difference.
Edited April 2, 2023 by hunter69
asnt Posted April 2, 2023 (Author)
On 3/31/2023 at 7:25 AM, hunter69 said: Well, I know it is the Plex docker. I have done a lot of uninstalling and installing of other Plex dockers. I am not making any progress resolving the crash. If the Plex docker is enabled, it crashes during the default maintenance window, 3-5am. I installed Plex on another computer and had it scan the same movie file, and it did not crash. So I think that eliminates corrupt files. I am stumped on the next step. I enabled the syslog server, but I do not see any logs in the share I told it to use. Have you made any progress?
I haven't had a crash since I turned on the syslog server. I did update the Plex docker two days ago, so if the problem was in that docker, it may be fixed. I'll update here if something changes. I hope you can identify the problem with your server.
hunter69 Posted April 2, 2023
If I start the array in maintenance mode, it does not crash. I will say that ping rates fluctuate. I have two M.2 cache drives. What would be the best/safest way to eliminate the M.2 drives and yet still be able to re-enable them as cache in the future if they prove not to be an issue? One cache has my domains and appdata shares.
JorgeB Posted April 3, 2023
You can just unassign the devices.
hunter69 Posted April 4, 2023
So I replaced the M.2. The server continues to crash every 3 minutes. I am down to the motherboard or the processor. I am looking for a replacement motherboard. Got any ideas on how to tell whether it is the processor or the motherboard? This is a nice, versatile setup. It is a Xeon 1290p. Sad that I am having these hardware problems after only 3 years of use.
asnt Posted April 5, 2023 (Author)
On 3/28/2023 at 1:07 AM, JorgeB said: Enable the syslog server and post that after a crash.
After 3 days, another crash. Here are the syslog and diagnostics after the crash. Do you see anything that stands out as the reason for the crash? Thanks!
syslog-192.168.50.156.log
m93p-diagnostics-20230405-0926.zip
Solution JorgeB Posted April 5, 2023
Mar 30 20:42:02 M93p kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Mar 30 20:42:02 M93p kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
Macvlan call traces are usually the result of having dockers with a custom IP address, and they will end up crashing the server. Switching to ipvlan should fix it: Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right).
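For anyone checking their own saved syslog for the same symptom, here is a minimal Python sketch that pulls out kernel lines tagged with the macvlan module. It assumes the log uses the same line shape as the two lines quoted above (`<timestamp> <host> kernel: ... [macvlan]`); the function names `find_macvlan_traces` and `scan_file` are hypothetical helpers made up for this example, not part of Unraid or any tool. A match only suggests the macvlan issue described above, it does not prove it is the cause of a crash.

```python
import re
from typing import Iterable, List

# Matches kernel syslog lines that mention the macvlan module, e.g.
# "Mar 30 20:42:02 M93p kernel: macvlan_broadcast+0x10a/0x150 [macvlan]"
# Assumption: the log format matches the lines quoted in this thread.
MACVLAN_LINE = re.compile(r"kernel:.*\[macvlan\]")


def find_macvlan_traces(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like macvlan call-trace frames."""
    return [line.rstrip("\n") for line in lines if MACVLAN_LINE.search(line)]


def scan_file(path: str) -> List[str]:
    """Scan a saved syslog file (hypothetical helper) for macvlan frames."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_macvlan_traces(handle)
```

To use it, call `scan_file` with the path to the log written by the syslog server and print whatever it returns; an empty list means no macvlan-tagged kernel lines were found in that file.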
asnt Posted April 5, 2023 (Author)
1 hour ago, JorgeB said: Mar 30 20:42:02 M93p kernel: macvlan_broadcast+0x10a/0x150 [macvlan] Mar 30 20:42:02 M93p kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan] Macvlan call traces are usually the result of having dockers with a custom IP address and will end up crashing the server, switching to ipvlan should fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right))
Thank you! I switched to ipvlan and hopefully this will fix the crashes.
hunter69 Posted April 6, 2023
To update, I finally figured it out. To be short, it was my fault. I have an LSI 9220-8i. I was using an expander from long ago, when I had over 10 drives. Things worked well back then. Today, Unraid did not like this expander. In fact, I figured it out after (I can't count how many) hours of research, when I saw a picture of the same expander with the caption "do not use with Unraid". So I reconfigured my drives and everything returned to normal. By the way, for anyone who is doing research because of strange crashes: there was nothing in the logs to indicate this was the issue.
hunter69 Posted April 7, 2023
I was wrong, the issue has not been resolved. My Unraid server continues to crash. Here is what I observed:
Scenario 1 - I have an LSI SAS9220-8i. I eliminated it by moving all the drives to the onboard motherboard SATA controller. What I observed here is that the server would crash every 3 minutes. After researching, I started thinking this issue could be caused by using all the onboard SATA ports plus having 2 M.2 drives. I have read that using the M.2 slots could affect some of the SATA ports. Am I correct or incorrect in my thinking?
Scenario 2 - So I moved the drives from the onboard controller back to the LSI SAS9220-8i. It stopped crashing, or so I thought. When I reinstalled the Plex docker, the server began crashing again. Note this is where it all started. Now I am wondering if this could be the issue: I have the LSI SAS9220-8i in a PCI x1 slot. Note I have had this card in the same slot for 3 years with nothing but stability. I have owned the card since Unraid version 4. Could the root cause be the LSI card being in a PCI x1 slot?
An interesting fact is what I observed when the server crashed in each scenario:
Scenario 1 - The server would crash, as in I could not access the GUI and could not ping the server. I had a monitor on the server. I could see the screen but could not type on it.
Scenario 2 - When the server crashes, it powers off.
I am down to the following possible root causes:
- The LSI board
- The motherboard
- The processor
I have changed the RAM and power supply and eliminated all unnecessary drives. Any and all ideas are welcome.
JorgeB Posted April 7, 2023
7 minutes ago, hunter69 said: I have read that using the M.2 slots could affect some of the SATA ports. Am I correct or incorrect in my thinking?
Using SATA M.2 devices will usually disable some SATA ports. This is not a problem with NVMe, nor is it a stability concern: the ports either work or they don't, nothing more.
9 minutes ago, hunter69 said: Could the root cause be the LSI card being in a PCI x1 slot?
An x1 slot will limit bandwidth, but it should not cause any stability issues; again, it either works or it doesn't. It could just be some hardware going bad. Unfortunately, it's not usually easy to tell which part without starting to swap some parts around.
hunter69 Posted April 7, 2023 (edited)
I'm down to the big items: the motherboard and processor. I have a replacement motherboard on the way, but if it's not that, I could buy a different/replacement LSI card. Beyond that, it might begin to get really expensive. What bothers me is that when I find a server shut off, I know something is up with the power supply. When I did the swap, I used an older 650 PSU from my old Unraid server. It lacks one 4-pin 12-volt connector. I had read that, depending on the CPU, the motherboard might not need that connector. Well, the server booted as normal and began crashing just the same. Do you think I can be fairly confident that it isn't the power supply?
Edited April 7, 2023 by hunter69
hunter69 Posted April 10, 2023
Well, it has been stable for a few days. I had ordered new RAM and that is what fixed it. Thanks to everyone who took the time to help me out!
asnt Posted April 11, 2023 (Author)
My server has been stable for 6 days now after switching to ipvlan. Thank you for the help! I will change the post to solved.