
Unraid Crashes When Starting The Array


Solved by Vyktrii


A few days ago I was encountering major crashing issues.

I managed to temporarily solve them by using a USB Ethernet adapter, and then more or less solved them by buying a new router and replacing the Unraid USB stick. Those crashes stopped for me, but a different kind of crash has started.

 

Over the past few days, my Unraid sometimes crashes, mostly after I do something in Docker such as uninstalling or stopping a container. Before it crashes, a few CPU threads get stuck at 100% for around 3-4 minutes. After that, Unraid keeps crashing every time the array is started. The only workaround is to delete the Docker image and reinstall all plugins/containers.

 

Yesterday my Unraid crashed after I stopped a Docker container, and it kept crashing. I updated from 6.11.5 to 6.12 but it still kept crashing; deleting the Docker image fixed it. I have had this issue a couple of times where I need to delete the Docker image.

About troubleshooting hardware: my motherboard has been replaced and is new. My RAM passes memtest for 12 hours, and swapping in the RAM sticks from my main PC doesn't help with the crashes. I also tested my CPU by running Prime95 for 1 hour and then running Cinebench 3 times. I don't know how to test the CPU on Unraid, so I just transcoded three 4K streams to 1080p using Plex (it pushes the CPU to a constant 94-98%). The only thing that fixes this is deleting and reinstalling the Docker image.

 

syslog-192.168.1.10.log kassandra-diagnostics-20230621-1317.zip

25 minutes ago, JorgeB said:

Call traces I see are Nvidia related, does it help if you uninstall the Nvidia driver?

 

I'm unable to replicate the issue since I fixed it by nuking the Docker image, but I will try uninstalling the Nvidia driver. I also forgot to mention that when this issue occurs, it sometimes kills even my new router, but not always (the last crash consistently killed my router).

 

Also, this post has similar symptoms to mine: https://forums.unraid.net/topic/135151-docker-unresponsive-unraid-at-100-cpu-eventual-system-crash/

7 hours ago, JorgeB said:

Call traces I see are Nvidia related, does it help if you uninstall the Nvidia driver?

I encountered another crash. This time reinstalling Docker did not help much, as it crashed again after a few minutes. Uninstalling the Nvidia driver also did not help. I also removed the GPU and tested it in another PC by running benchmarks on it for an hour, and it did not crash. So I'm guessing the only thing I can now replace is the CPU itself, or could it still be the GPU?

On 6/21/2023 at 9:16 PM, JorgeB said:

Difficult to say for certain.

I replaced both the CPU and GPU (also the PSU). I still get crashes within ~10 minutes of starting the array on 6.12.1, although the errors are now different. Rolling back to 6.11.5 makes my system much more stable, but I still get crashes after a few hours (sometimes it crashes as soon as the array starts, sometimes it doesn't crash for hours). It crashes so hard that I'm unable to find crash details in the log files, but what does get logged is still call trace errors. The crashes are very unpredictable, but they occur mostly when Docker containers start (sometimes a specific Docker container crashes it, sometimes it doesn't).
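Since the crash details are hard to spot in a long syslog, one way to narrow things down is to pull out the few lines that precede each "Call Trace:" entry. The following is only a minimal sketch: the function name `trace_contexts` and the `before` parameter are made up for illustration, and it assumes the syslog is a plain-text file passed on the command line.

```python
from __future__ import annotations

import os
import sys
from collections import deque


def trace_contexts(lines, before=3):
    """Return, for each 'Call Trace:' line, the `before` lines logged just ahead of it."""
    recent = deque(maxlen=before)  # sliding window of the most recent lines
    contexts = []
    for raw in lines:
        line = raw.rstrip("\n")
        if "Call Trace:" in line:
            contexts.append(list(recent))
        recent.append(line)
    return contexts


if __name__ == "__main__":
    # Hypothetical usage: python3 traces.py syslog-192.168.1.10.log
    if len(sys.argv) > 1 and os.path.isfile(sys.argv[1]):
        with open(sys.argv[1], errors="replace") as handle:
            for number, context in enumerate(trace_contexts(handle), start=1):
                print(f"--- call trace #{number} ---")
                print("\n".join(context))
```

The lines just before a call trace usually carry the headline of the fault, so comparing them across crashes can show whether the same thing fails every time or something different each time.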

 

error.txt

syslog-192.168.1.10.log

kassandra-diagnostics-20230629-0026.zip

Edited by Vyktrii
2 minutes ago, JorgeB said:

With all the call traces still looks like a hardware issue to me.

I now have a new mobo and a new CPU, and I also swapped the RAM sticks from my main rig into the Unraid rig. It's basically new hardware. Which other component can result in call traces?

37 minutes ago, JorgeB said:

Most often RAM, CPU and/or board.

Since I have swapped/replaced everything, is it possible that Linux is not compatible with my mobo (B760)? Are there any essential settings in the BIOS that could cause these crashes? I have XMP disabled, and I did not change any other setting.


At this point it's basically a new PC lol. Since the start I have changed the CPU, GPU, mobo and PSU, and even swapped the RAM sticks. Anyway, I found a temporary fix: I copied appdata, nuked the cache drive (precleared it), rebuilt Docker and copied the appdata back. My rig has been stable for quite a few hours and only the Unraid API is giving me errors. Let's see how long it holds up.

Edited by Vyktrii
21 hours ago, JorgeB said:

It's possible, do you have another PC you could test Unraid on?

I made some progress. Nuking and rebuilding the cache drive solves it, but it's a temporary workaround. It works fine, but when I try shutting down Unraid, it's unable to unmount the cache drive and gives me errors like these:

/mnt/cache
Jun 30 12:21:07 Kassandra root: umount: /mnt/cache: target is busy.
Jun 30 12:21:07 Kassandra emhttpd: shcmd (68357): exit status: 32
Jun 30 12:21:07 Kassandra emhttpd: Retry unmounting disk share(s)...
Jun 30 12:21:12 Kassandra emhttpd: Unmounting disks...
Jun 30 12:21:12 Kassandra emhttpd: shcmd (68358): umount

 

After the next few reboots I start getting call trace errors again after some time. I tried 2 SSDs and still get the same issue. Also, running fuser -v /mnt/cache gives me this result: "USER PID ACCESS COMMAND /mnt/cache: root kernel mount /mnt/cache". I think my crashes have something to do with my cache drive getting corrupted after a few reboots. I'm unable to find out why my cache drive won't unmount properly. My timeout is also 300 seconds instead of the default 90.
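The fuser output above only lists the kernel's own mount entry, which suggests no ordinary process has files open on /mnt/cache. One possibility worth ruling out (an assumption, nothing in this thread confirms it) is that something else is still mounted below /mnt/cache, or that a loop device is backed by a file stored on it. A rough Python sketch of that check follows; the helper names `nested_mounts` and `loop_devices_backed_by` are made up for illustration, and it relies only on the standard Linux `/proc/mounts` and `/sys/block/loop*/loop/backing_file` files.

```python
from __future__ import annotations

import glob
import os


def _unescape(field: str) -> str:
    # /proc/mounts writes spaces, tabs, newlines and backslashes as octal escapes.
    for escaped, char in (("\\040", " "), ("\\011", "\t"), ("\\012", "\n"), ("\\134", "\\")):
        field = field.replace(escaped, char)
    return field


def nested_mounts(mounts_text: str, target: str) -> list[str]:
    """Return mount points located strictly below `target`."""
    target = target.rstrip("/") or "/"
    prefix = target if target == "/" else target + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        mount_point = _unescape(fields[1])  # second column is the mount point
        if mount_point != target and mount_point.startswith(prefix):
            found.append(mount_point)
    return found


def loop_devices_backed_by(target: str) -> list[tuple[str, str]]:
    """Return (loop device, backing file) pairs whose backing file path starts with `target`."""
    prefix = target.rstrip("/") + "/"
    hits = []
    for path in glob.glob("/sys/block/loop*/loop/backing_file"):
        try:
            with open(path) as handle:
                backing = handle.read().strip()
        except OSError:
            continue  # loop device went away or is not readable
        if backing.startswith(prefix):
            hits.append((path.split("/")[3], backing))
    return hits


if __name__ == "__main__":
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            print("nested mounts:", nested_mounts(handle.read(), "/mnt/cache"))
        print("loop devices:", loop_devices_backed_by("/mnt/cache"))
```

If both lists come back empty right before a shutdown, this particular theory can be crossed off. Note that a backing file reached through a different path (for example a user share) would not match the /mnt/cache prefix, so an empty result is not conclusive.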

 

Also, the Unraid API plugin was giving me GPF errors. I uninstalled it, and I think it helped a bit. The Unraid API gave me this error: kernel: traps: unraid-api[5954] general protection fault ip:1bf921c sp:7ffcd57f2548 error:0 in unraid-api[91c000+167b000]
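For anyone comparing several of these GPF lines, the fields can be split out and the faulting address turned into an offset inside the mapped binary (ip minus the base given in the square brackets). This is just a sketch: `parse_trap_line` is a made-up helper, and the regex is written against the single line quoted above, so other kernel messages may need adjustments.

```python
from __future__ import annotations

import re

# Pattern modelled on the one trap line quoted in this post; an assumption, not a full grammar.
TRAP_RE = re.compile(
    r"traps: (?P<name>\S+)\[(?P<pid>\d+)\] (?P<reason>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_trap_line(line: str) -> dict | None:
    """Extract process, reason and the fault offset inside the mapped module, if present."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    info = {
        "name": match["name"],
        "pid": int(match["pid"]),
        "reason": match["reason"],
        "ip": int(match["ip"], 16),
        "module": match["module"],
        "offset": None,
        "inside_mapping": None,
    }
    if match["base"] is not None:
        base = int(match["base"], 16)
        size = int(match["size"], 16)
        info["offset"] = info["ip"] - base  # position of the fault relative to the mapping start
        info["inside_mapping"] = 0 <= info["offset"] < size
    return info


if __name__ == "__main__":
    line = (
        "kernel: traps: unraid-api[5954] general protection fault "
        "ip:1bf921c sp:7ffcd57f2548 error:0 in unraid-api[91c000+167b000]"
    )
    info = parse_trap_line(line)
    print(info["name"], info["reason"], hex(info["offset"]))
```

As a rough rule of thumb, the same offset in the same program on every crash hints at a software bug, while faults scattered across different programs and offsets fit better with bad hardware or corrupted data on disk.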

 

So as long as I don't reboot or shut down, it's working fine lol.

kassandra-diagnostics-20230630-1232.zip
