6.10.3 Unraid unstable and crashes!

June 27, 2022

Hi, I had the occasional crash with the RC releases, but it seems to have become more unstable with the stable releases! Not sure what's going on, but it crashed on Friday and nothing was working: no GUI and no response on the command line, so an unclean shutdown! And the same again this morning! I was suspicious of CA Backup because on the RCs the crashes seemed to happen at 6:00am, which is when it runs, but this morning it wasn't at that time! So I'm hoping someone cleverer than me can spot something. I've attached diagnostics from Friday and today.

Thanks, Tim

tower-diagnostics-20220624-1452.zip tower-diagnostics-20220627-1114.zip

June 27, 2022

Community Expert

Enable the syslog server and post that after a crash.

June 27, 2022

Author

Already enabled, here is the log! Sorry, I should have thought of including that!

syslog-10.19.64.2.log

June 27, 2022

Community Expert

May 14 00:41:06 Tower kernel: macvlan_broadcast+0x116/0x144 [macvlan]
May 14 00:41:06 Tower kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]

Macvlan call traces are usually the result of having dockers with a custom IP address. Switching to ipvlan should fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right), or see below for more info.

https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/
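For anyone checking their own log for the same thing, here is a minimal Python sketch that counts which kernel modules are named in call-trace frames like the two lines above. The function name `modules_in_traces` and the regular expression are illustrative assumptions, not part of Unraid, and the pattern only covers frames of the form `function+0xNN/0xNN [module]`.

```python
import re
from collections import Counter

# Matches call-trace frames such as
#   "kernel: macvlan_broadcast+0x116/0x144 [macvlan]"
# and captures the function name and the bracketed module name.
FRAME_RE = re.compile(
    r"kernel:\s+\??\s*(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s+\[(\w+)\]"
)

def modules_in_traces(lines):
    """Count how often each kernel module appears in call-trace frames."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# The two lines quoted in this post.
sample = [
    "May 14 00:41:06 Tower kernel: macvlan_broadcast+0x116/0x144 [macvlan]",
    "May 14 00:41:06 Tower kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]",
]
print(modules_in_traces(sample))  # Counter({'macvlan': 2})
```

To run it on a saved syslog, pass the open file instead of `sample`, for example `modules_in_traces(open("syslog-10.19.64.2.log", errors="replace"))`. A module that dominates the count is only a hint about where to look, not proof of the cause.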

P.S. Unrelated but you're also running out of RAM.

June 27, 2022

Author

OK, thanks for that. I've changed the Plex docker back to host as that was the only one with a static IP, so hopefully that'll fix it!

I'll keep an eye on the RAM. It's probably when I'm running the Windows 11 VM; if so I'm not worried, but I'll check that it's nothing else!

Thanks again for your help!

Cheers,

Tim

June 30, 2022

Author

It crashed again last night, so sadly it seems that hasn't fixed the issue? It happened at around 20:30 - 20:45, the system wasn't totally hung like before though and I was able to restart properly! Logs attached, hopefully there is something there?

Cheers,

Tim

syslog-10.19.64.2.log tower-diagnostics-20220630-0839.zip

June 30, 2022

Community Expert

13 minutes ago, MothyTim said:

It happened at around 20:30 - 20:45

Unfortunately there's nothing logged. I assume it happened between these lines:

Jun 29 19:53:06 Tower autofan: Highest disk temp is 40C, adjusting fan speed from: 163 (63% @ 1548rpm) to: 140 (54% @ 1336rpm)
Jun 29 20:50:13 Tower kernel: mdcmd (36): set md_write_method 1
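A quiet stretch like this can be found mechanically. The following is a rough Python sketch, assuming the classic syslog timestamp prefix shown above (which carries no year, so the `year` parameter is a guess supplied by the caller); `largest_gap` is a hypothetical helper name.

```python
from datetime import datetime

def largest_gap(lines, year=2022):
    """Return (gap, line_before, line_after) for the longest silence in a syslog."""
    stamped = []
    for line in lines:
        try:
            # Classic syslog prefix, e.g. "Jun 29 19:53:06" (15 characters).
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parseable timestamp
        stamped.append((ts, line.rstrip()))
    best = None
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, l0, l1)
    return best

# The two lines quoted in this post.
sample = [
    "Jun 29 19:53:06 Tower autofan: Highest disk temp is 40C, adjusting fan speed from: 163 (63% @ 1548rpm) to: 140 (54% @ 1336rpm)",
    "Jun 29 20:50:13 Tower kernel: mdcmd (36): set md_write_method 1",
]
gap, before, after = largest_gap(sample)
print(gap)  # 0:57:07
```

A long gap is not an error in itself, since an idle server can go quiet for a while. It only narrows down the window in which the crash probably happened.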

June 30, 2022

Author

OK, thanks for looking so quickly. I think it was something with docker, as Plex became unresponsive and I tried to restart its container, but it threw a server error and the container wouldn't start. I guess I could have restarted the docker service but opted to restart the whole machine!

Cheers,

Tim

July 1, 2022

Author

OK, so again a problem. I think the GUI crashed, as all dockers etc. seemed to still be working but there was no GUI. The system wouldn't restart, so I had to hit the oh shit button! Logs attached! I'm wondering if I'll have to give up on 6.10 and go back to 6.9?

Cheers,

Tim

tower-diagnostics-20220701-1811.zip syslog-10.19.64.2.log

July 1, 2022

Community Expert

Jul  1 13:48:56 Tower kernel: general protection fault, probably for non-canonical address 0x9c0101000034: 0000 [#1] SMP PTI

Now there's something, but no Linux module/driver is mentioned, so I don't know whether it was a software or hardware issue.

July 1, 2022

Author

Ok thanks for looking, annoying that it’s not more specific!

cheers,

Tim

July 1, 2022

Community Expert

What do you get from command line with this?

df -h /

Check periodically to see if usage is growing.
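If typing the command by hand gets tedious, the same check can be sketched in Python. This is only an illustration under assumptions: Python 3 has to be available on the server (it is not part of a stock install as far as I know), and `sample_root` and `growth_mib` are made-up helper names. `shutil.disk_usage` reports used space slightly differently from df's Use% column, so compare its readings with each other rather than with df.

```python
import shutil
from datetime import datetime

def sample_root(path="/"):
    """Take one reading of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return {
        "time": datetime.now().isoformat(timespec="seconds"),
        "used_mib": usage.used // 2**20,
        "percent": round(100 * usage.used / usage.total, 1),
    }

def growth_mib(samples):
    """Difference in used space between the first and last reading."""
    return samples[-1]["used_mib"] - samples[0]["used_mib"]

# One reading now; collect several over a few days and compare them.
print(sample_root())
```

A steady climb across readings is the thing to look for. Small ups and downs are expected.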

July 2, 2022

Author

root@Tower:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
rootfs           16G  953M   15G   6% /
root@Tower:~#

July 2, 2022

Author

root@Tower:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
rootfs           16G  959M   15G   7% /
root@Tower:~#

July 2, 2022

Community Expert

rootfs is the space the OS uses for its own files, and if it fills, all sorts of things can go wrong since the OS no longer has any space to work.

Sometimes users will have something misconfigured, such as a docker host path mapped to a location that isn't actual mounted storage, so the data ends up in rootfs.

Your usage has grown a little but nowhere near full yet and maybe it won't fill up. Just another thing you can check on if other things don't give a clue.

July 2, 2022

Author

OK, thanks for the info, just ran it again.

root@Tower:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
rootfs           16G  953M   15G   6% /
root@Tower:~#

Cheers,

Tim

July 3, 2022

Author

And this morning's:

root@Tower:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
rootfs           16G  980M   15G   7% /
root@Tower:~#

Cheers,

Tim

July 3, 2022

Community Expert

Looks normal.

July 3, 2022

Author

Ok thanks

August 3, 2022

Author

OK, so it stayed up for a whole month and has crashed again this morning! So bad this time that it won't boot back up again! This is beyond frustrating now. I should have guessed a problem was looming, as CA Backup has failed to restart my docker containers for 2 days on the trot. I ran diagnostics before the reboot as all the USB devices had disappeared from USB Manager!

Really hope something shows in the logs this time.

Cheers,

Tim

tower-diagnostics-20220803-1238.zip

August 3, 2022

Community Expert

Aug  1 12:35:11 Tower kernel: general protection fault, probably for non-canonical address 0x9c0101000034: 0000 [#1] SMP PTI

Aug  1 13:01:44 Tower kernel: irq 16: nobody cared (try booting with the "irqpoll" option)

A couple of days ago there was the same error as last time, and then IRQ 16 also got disabled. That's possibly not a big issue since the server worked for 2 more days. The main issue was that this USB controller stopped working:

Aug  3 12:35:33 Tower kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead

And the flash drive was using it, so after that Unraid cannot continue to work correctly.
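The messages quoted in this thread can be searched for in one pass. Below is a minimal Python sketch that flags them in a saved syslog. The `SIGNATURES` table and its labels are illustrative assumptions built only from the lines posted here, not an official or complete list of kernel errors.

```python
import re

# Patterns taken from the kernel messages quoted in this thread.
SIGNATURES = {
    "macvlan call trace": re.compile(r"\[macvlan\]"),
    "general protection fault": re.compile(r"general protection fault"),
    "IRQ disabled": re.compile(r"irq \d+: nobody cared"),
    "USB controller dead": re.compile(r"xHCI host controller not responding"),
}

def classify(lines):
    """Return (label, line) pairs for every line matching a known signature."""
    hits = []
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((label, line.rstrip()))
    return hits

sample = [
    "Aug  1 12:35:11 Tower kernel: general protection fault, probably for non-canonical address 0x9c0101000034: 0000 [#1] SMP PTI",
    'Aug  1 13:01:44 Tower kernel: irq 16: nobody cared (try booting with the "irqpoll" option)',
    "Aug  3 12:35:33 Tower kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead",
    "Aug  3 12:36:00 Tower kernel: mdcmd (36): set md_write_method 1",
]
for label, line in classify(sample):
    print(f"{label}: {line}")
```

The last sample line is ordinary activity and is not flagged. A match only shows that the message occurred; working out whether the cause is hardware or software still needs the surrounding context.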

August 3, 2022

Author

I've got a bad feeling that the USB on the motherboard has failed or is failing, as the mouse doesn't light up! It could be a single failed port, so I'm just trying it with minimal stuff plugged in. The Unraid USB stick mounts and is browsable on my Mac. Do you think it might be corrupted? Should I reflash it?

Thanks for your help again.

Cheers,

Tim

August 3, 2022

Community Expert

Run chkdsk on it but the flash should be OK.

August 3, 2022

Author

OK, I ran First Aid on the stick on my Mac and it's booted up. The attached screen shows an error though:

XFS (md3): Metadata corruption detected at xfs_dinode_verify+0xa4/0x56f [xfs], inode 0x5c1bf6 dinode

XFS (md3): Unmount and run xfs_repair

Then lines of hex code!

What do I do about that?

Thanks,

Tim

August 3, 2022

Community Expert

3 minutes ago, MothyTim said:

What do I do about that?

3 minutes ago, MothyTim said:

Unmount and run xfs_repair

Check filesystem on disk3.

Solved by JorgeB