6.10.3 Unraid unstable and crashes!


Solved by JorgeB

Hi, I had the occasional crash with the RC releases, but it seems to have become more unstable with the stable releases! Not sure what's going on, but it crashed on Friday and nothing was working: no GUI and no response on the command line! So, unclean shutdown! And the same again this morning! I was suspicious of CA Backup because the crashes seemed to happen at 6:00am on the RCs, which is when it runs, but this morning it wasn't that time! So hoping someone cleverer than me can spot something? I've attached diagnostics from Friday and today. :)

Thanks, Tim

tower-diagnostics-20220624-1452.zip tower-diagnostics-20220627-1114.zip

May 14 00:41:06 Tower kernel: macvlan_broadcast+0x116/0x144 [macvlan]
May 14 00:41:06 Tower kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]

 

Macvlan call traces are usually the result of having Docker containers with a custom IP address. Switching to ipvlan should fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right), or see below for more info.

https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/

See also here:

https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
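If you want to check a syslog yourself for traces like the two macvlan lines quoted above, here is a minimal sketch. It assumes Python 3 is available somewhere you can copy the syslog to (the thread doesn't say it is on the server), and the function name `modules_in_traces` is just made up for illustration. It counts which kernel module shows up in stack-frame lines, so a pile of `[macvlan]` frames stands out.

```python
import re
from collections import Counter

# Matches kernel stack-frame lines such as
# "... kernel: macvlan_broadcast+0x116/0x144 [macvlan]"
# and captures the function name and the module name in brackets.
FRAME_RE = re.compile(
    r"kernel:\s+\??\s*(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+\s+\[(\w+)\]"
)

def modules_in_traces(lines):
    """Count how often each kernel module appears in stack-frame lines."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# The two lines quoted in this post, used as sample input.
sample = [
    "May 14 00:41:06 Tower kernel: macvlan_broadcast+0x116/0x144 [macvlan]",
    "May 14 00:41:06 Tower kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]",
]

if __name__ == "__main__":
    for module, count in modules_in_traces(sample).most_common():
        print(f"{module}: {count} frame(s)")
```

To run it against a real log, pass `open(path)` instead of `sample`; the path depends on where you extracted the diagnostics zip.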

 

P.S. Unrelated but you're also running out of RAM.

 


Ok, thanks for that, I've changed the Plex docker back to host as that was the only one with a static IP, hopefully that'll fix it!

 

I'll keep an eye on the RAM; it's probably when I'm running the Windows 11 VM. If so I'm not worried, but I'll check that it's nothing else!

 

Thanks again for your help!

Cheers,

Tim

13 minutes ago, MothyTim said:

It happened at around 20:30 - 20:45

Unfortunately there's nothing logged; I assume it happened between these lines:

 

Jun 29 19:53:06 Tower autofan: Highest disk temp is 40C, adjusting fan speed from: 163 (63% @ 1548rpm) to: 140 (54% @ 1336rpm)
Jun 29 20:50:13 Tower kernel: mdcmd (36): set md_write_method 1
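A quiet stretch like the one between those two lines can also be found mechanically. This is only a rough sketch, assuming standard syslog timestamps at the start of each line (like the ones quoted) and assuming you supply the year yourself, since syslog lines don't carry one. `largest_gap` is a hypothetical helper name.

```python
from datetime import datetime

def largest_gap(lines, year=2022):
    """Return (gap, line_before, line_after) for the longest silence in a log.

    Returns None if fewer than two timestamped lines are found.
    """
    best = None
    prev_time, prev_line = None, None
    for line in lines:
        try:
            # The first 15 characters hold e.g. "Jun 29 19:53:06"; the year is
            # an assumption passed in by the caller.
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a syslog timestamp
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best

# The two lines quoted above, used as sample input.
sample = [
    "Jun 29 19:53:06 Tower autofan: Highest disk temp is 40C, adjusting fan speed "
    "from: 163 (63% @ 1548rpm) to: 140 (54% @ 1336rpm)",
    "Jun 29 20:50:13 Tower kernel: mdcmd (36): set md_write_method 1",
]

if __name__ == "__main__":
    gap, before, after = largest_gap(sample)
    print("Longest silence:", gap)
    print("  before:", before[:15])
    print("  after: ", after[:15])
```

A long gap doesn't prove a crash happened in it (an idle server can simply log nothing), but it narrows down where to look.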

 


Ok, thanks for looking so quickly. I think it was something with docker, as Plex became unresponsive and I tried to restart its container, but it threw a server error and the container wouldn't start. I guess I could have restarted the docker service but opted to restart the whole machine!

Cheers,

Tim 


rootfs is the space the OS uses for its own files, and if it fills all sorts of things can go wrong since the OS no longer has any space to work.

 

Sometimes users will have something misconfigured such as a docker host mapping to a path that isn't actual mounted storage, which would be in rootfs.

 

Your usage has grown a little but nowhere near full yet and maybe it won't fill up. Just another thing you can check on if other things don't give a clue.
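If you'd rather track rootfs usage over time than eyeball it, something like the following would do as a rough sketch. The usual way to check is just `df -h /` from the command line; this version assumes Python 3 is installed (not stated anywhere in this thread), and `percent_used` is a made-up helper name.

```python
import shutil

def percent_used(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0  # avoid dividing by zero on pseudo-filesystems
    return 100.0 * usage.used / usage.total

if __name__ == "__main__":
    # "/" is where rootfs is mounted.
    print(f"rootfs is {percent_used('/'):.1f}% full")
```

If the figure keeps climbing between reboots, that points back at something writing to a path that isn't real mounted storage.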

  • 1 month later...

Ok, so it stayed up for a whole month and has crashed again this morning! So bad this time that it won't boot back up again! This is beyond frustrating now. I should have guessed a problem was looming, as CA Backup has failed to restart my docker containers for 2 days on the trot. I ran diagnostics before the reboot, as all the USB devices had disappeared from USB Manager!

 

Really hope something shows in the logs this time.

Cheers,

Tim

tower-diagnostics-20220803-1238.zip

Aug  1 12:35:11 Tower kernel: general protection fault, probably for non-canonical address 0x9c0101000034: 0000 [#1] SMP PTI

 

Aug  1 13:01:44 Tower kernel: irq 16: nobody cared (try booting with the "irqpoll" option)

 

A couple of days ago there was the same error as last time, and then IRQ 16 also got disabled. That's possibly not a big issue, since the server worked for 2 more days; the main issue was that this USB controller stopped working:

 

Aug  3 12:35:33 Tower kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead

 

And the flash drive was using it, so after that Unraid cannot continue to work correctly.
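For anyone wanting to pull the same three kinds of lines out of their own syslog, here is a minimal sketch. The patterns come straight from the lines quoted above; the labels and the `find_signatures` name are made up for illustration, and this is nowhere near a complete list of things worth looking for.

```python
import re

# Hypothetical labels; the patterns are taken from the log lines quoted in this post.
SIGNATURES = {
    "general protection fault": re.compile(r"general protection fault"),
    "unhandled IRQ": re.compile(r"irq \d+: nobody cared"),
    "USB controller dead": re.compile(r"xHCI host controller not responding"),
}

def find_signatures(lines):
    """Return (label, line) pairs for log lines matching a known signature."""
    hits = []
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((label, line.rstrip()))
                break  # one label per line is enough
    return hits

# The three lines quoted in this post, used as sample input.
sample = [
    "Aug  1 12:35:11 Tower kernel: general protection fault, probably for "
    "non-canonical address 0x9c0101000034: 0000 [#1] SMP PTI",
    'Aug  1 13:01:44 Tower kernel: irq 16: nobody cared (try booting with the "irqpoll" option)',
    "Aug  3 12:35:33 Tower kernel: xhci_hcd 0000:00:14.0: xHCI host controller "
    "not responding, assume dead",
]

if __name__ == "__main__":
    for label, line in find_signatures(sample):
        print(f"[{label}] {line[:15]}")
```

Listing the hits in log order makes the sequence easier to see: the fault first, then the IRQ getting disabled, then the controller dropping out two days later.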


I've got a bad feeling that the USB on the motherboard has failed or is failing, as the mouse doesn't light up! It could be that a port has failed; I'm just trying it with minimal stuff plugged in! The Unraid USB stick mounts and is browsable on my Mac. Do you think it might be corrupted? Should I reflash it?

 

Thanks for your help again.

Cheers,

Tim


Ok, ran First Aid on the stick on my Mac and it's booted up. The attached screen shows an error though: XFS (md3): Metadata corruption detected at xfs_dinode_verify+0xa4/0x56f [xfs], inode 0x5c1bf6 dinode

XFS (md3): Unmount and run xfs_repair

 

Then lines of hex code!

 

What do I do about that?

 

Thanks,

Tim
