Tithonius

Members
  • Posts

    25
Tithonius's Achievements

Noob (1/14)

1

Reputation

  1. This is definitely not a thermal crash. The card runs at a nice, cool 45C all the time under full transcoding load. Also, to be clear, the server didn't crash, just the GUI. I'm not overclocking anything, so the motherboard should easily be within thermal limits (it's just an i3-10100), and the CPU and everything else have never had a thermal crash or overheating issue before.
  2. Okay, I'm back, and GOOD NEWS EVERYONE! I fixed the random hard crashes that I'd been having! It seems that Unraid didn't like the "mismatched" RAM in my system. (It was the same model number, same timings, the works, but 2 of my 4 sticks had different PCB layouts.) That being said, I'm not out of the woods yet. I had a good run of about 10 days in a row with no crashes, and now I'm back to where I was before all of this started: the issue where, every so often, I get home, hit refresh on my webui, and am met with a 500 Internal Server Error. But now I have proper logging set up, and I was able to capture the error in a syslog. I think this shows what's going on (hopefully). I would love it if you guys could take a look. It does look to me like an Nvidia issue, which is why I posted here. Thanks again. syslog-192.168.1.10.log
  3. Okay, so my journey with my server crashing starts with installing a GPU. About a month ago I replaced the 1050 Ti in my gaming rig and decided it would be nice to put that GPU in my server. The server ran just fine for a few weeks, but then crashed. I thought, well crap, maybe Unraid doesn't like my GPU. So I uninstalled the Nvidia driver plugin (I was using the GPU for transcoding in Unmanic) and removed the GPU from the system. For a couple more days I didn't have any crashes. Then out of nowhere, back in a known working config, the server crashed again. So far I have replaced all of the RAM with new sticks, which have passed multiple 14-hour memtests, replaced the USB flash drive with a new one, checked all connectors in the machine, and reseated my HBA card. I am running out of things to try replacing. After every crash, I can reboot the system and be back to a "working" machine with no problem. As of now I have reinstalled the GPU and the Nvidia plugin, as it seems to crash with or without them. I have syslog running, but can't really see anything in the syslog after a crash. I have taken a few pictures of the screen output during a crash, but am unsure whether they are helpful at all. I'll attach everything here, but as of now I'm not sure what to even do going forward. syslog.txt eos-diagnostics-20230326-1026.zip
  4. Welp, we crashed again with the USB drive in a USB 3 port... time to add a PCI USB card... this is REALLY getting old... I'm ordering a new motherboard...
  5. The diagnostics after the reboot: eos-diagnostics-20230324-1921.zip
  6. Okay, so... I had another crash, and this time it happened while the syslog was being mirrored to flash. At the bottom of that syslog, you can see that the USB device is getting reset over and over, so it seems like maybe those USB ports on my motherboard are bad? I just replaced the flash drive with a new one, as I figured this might be the case. Is that what others see in this log as well? I really hope this is actually the issue... syslog.txt
  7. Welp, my server crashed again overnight after making the change to libtorrent v1... oof. My new flash drive is here, so I'm going to swap it in and do a fresh install of Unraid, just copying over my config folder, and see if that fixes this. I'm so tired of crashing...
  8. Sooooo I may have found what it might be: https://forums.unraid.net/bug-reports/stable-releases/crashes-since-updating-to-v611x-for-qbittorrent-and-deluge-users-r2153/page/6/?tab=comments#comment-21671 This bug with libtorrent v2 in qBittorrent and Deluge is scarily close to the issues that I have been having. It explains the randomness of the crashes, and the logs people have been showing look very similar to the log snippets I have been getting. I did have the server crash while Tdarr was off and nothing was transcoding (that was at the time; I have Unmanic now), but this whole time I have had qBittorrent set to auto-start on server boot, so it would explain all of that. I am still in hardcore troubleshooting mode, and if this does in fact fix the problem, I will come back and post my entire journey here for others as well. I'm also still going to put the small fan on my HBA, because I already placed the Amazon order for the supplies 😛 Also, I just want to say, you guys here have been so helpful. This most likely isn't even an issue with your software, and you are all doing everything you can to help, and I am so thankful for that. So, thanks.
  9. I had another crash today, with the same-looking log on the screen. But no, I hadn't thought about the HBA getting too hot... It hasn't been a problem before, but then again, now that I'm transcoding, I wonder if that's actually the problem. I wonder if there is a way to monitor that? Edit: It doesn't look like my 9201-8i has temperature monitoring, so I am just going to follow a Reddit guide for a good-looking 40mm fan mod and repaste it. Even if that's not the problem, it won't hurt to do.
  10. So this time, before the last crash, I had switched to Unmanic per your recommendation, and I like it better. Tdarr isn't even installed... I also asked myself what has changed since I started getting crashes. Then I realized that I had moved my HBA down a x16 slot to install the GPU, which meant that the HBA was only running at x4 speed. It "shouldn't" matter, but some people online were saying it might. So last night I swapped the GPU and HBA on the motherboard. I don't care if the GPU is in a x4 slot, as transcoding isn't any slower, and the HBA can have a full-bandwidth x16 slot. We have been good without crashes overnight for now, but I still have all my monitoring in place to keep an eye on it. I also ordered a new flash drive off the recommended list from Limetech; it should be here soon, just in case that's my issue. It's time for a flash drive change anyway, since this one is pretty old.
  11. Also, when I rebooted this time, my flash drive was not detected... I'm backing it up now on a Windows machine, but I wonder if something is just wrong with my flash drive.
  12. Okay, so I finally got a crash to happen while I had a screen attached, and this is the output that I can see: Obviously this isn't the whole crash, but does it give any info about what's going on?
  13. How possible is it that it's the GPU statistics plugin causing this crash? That's the only thing that has been new since these crashes have been happening... Also, is it possible that Tdarr is causing the crash?
  14. Okay, so last night I set the PCIe gen to 3 instead of auto, reseated the GPU in the slot, and reseated all the memory. It passed a 13-hour memory test with 0 errors and seemed to be okay. When I woke up this morning I saw that it had hard crashed again... The screen had gone to sleep, so when I get home from work today I'll set the screen to not sleep like you said and see if the console can show me what's going on. I assume the command I want to run is "tail -f /var/log/syslog", yeah? But that would seem to just show me what the syslog shows, and the syslog didn't have anything last time. Or is there a command that will show me more verbose output? I wouldn't be opposed to having the most info possible, if that's a thing. What command should I be running to see the console output? I also set the Docker network to ipvlan and restarted Docker, but didn't reboot after that change. Should I have rebooted? On the plus side, the GPU is showing x16 lanes now, so that's progress I guess... Still crashing though.
  15. Okay, so I have a screen attached now and I can see the login prompt. I assume that I just log in to the non-GUI console on the screen and then leave it? Or do I have to make it show me console output somehow?
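
For anyone following along with post 6: the repeated USB resets can be counted in a saved syslog with a small script. This is only a rough sketch in Python. It assumes the kernel logs resets in the usual "usb <port>: reset ... USB device" form, and the function name `count_usb_resets` and the sample lines are made up for illustration; they are not taken from the attached syslog.

```python
import re
from collections import Counter

# Assumption: USB resets appear in the kernel log in the common form
# "usb 1-3: reset high-speed USB device number 2 using xhci_hcd".
RESET_RE = re.compile(r"usb (\S+): reset \S+ USB device")


def count_usb_resets(lines):
    """Return a Counter mapping USB port id -> number of reset messages."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "Mar 24 19:01:02 server kernel: usb 1-3: reset high-speed USB device number 2 using xhci_hcd",
        "Mar 24 19:01:05 server kernel: usb 1-3: reset high-speed USB device number 2 using xhci_hcd",
        "Mar 24 19:01:06 server kernel: sd 0:0:0:0: [sda] some unrelated message",
    ]
    for port, n in count_usb_resets(sample).most_common():
        print(f"{port}: {n} resets")
```

To run it against a real file, open the saved syslog and pass the file object in, e.g. `count_usb_resets(open("syslog.txt"))`. A port that shows up many times in a short window is a reasonable hint that the drive or the port itself is flaky.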