GatorMB

Members
  • Posts

    20
  • Joined

  • Last visited

About GatorMB

  • Birthday October 25

Converted

  • Gender
    Male
  • Location
    Canada

GatorMB's Achievements

Noob (1/14)

1

Reputation

  1. Good morning! Here is the update... Server is 100% stable. It has not crashed at all since the CPU was swapped out. However, I did find another issue. The Nvidia Tesla P4 GPU gets super hot when transcoding 3 files in Tdarr. I have disabled the Tdarr container and it runs stable. I have ordered a fan for the GPU to help cool it. More to follow on the success of that! Again, huge thanks to JorgeB & JonathanM for all your help!
  2. For some reason the GPU card shows when I boot, and then it falls off the system. I can't see the card details in the Nvidia driver section, and I can't see it in the GPU Statistics plugin. Am I doing something wrong, or is it a bad GPU? What alternative GPU would you recommend that won't break the bank?
  3. System is stable. It was 100% a bad processor. Thanks everyone for all your help!
  4. New processor arrived today. It’s now in and I’m up and running. I will report back within 72 hours if it’s stable. Fingers crossed!
  5. I also notice that I occasionally get a failure on the GPU plugin in Unraid. Do I need to disable the onboard video now that I am running the Tesla P4? Or do I have a bad GPU as well?
  6. I'm starting to believe you are correct. I initially ran this box with a Supermicro X9SCL mobo and CPU and it ran perfectly. It just couldn't handle transcoding and the mobo didn't have a slot for a GPU card. I upgraded the mobo to the Supermicro X11SSH-F and the CPU to the Xeon E3-1285 v6. RAM went from 32 to 64. All drives, PSU, cooling and case stayed the same. I started having failures. I changed the RAM and still had the same issue. I changed the mobo and had the same issue. I changed the PSU from a 600 80+ White to an 850 80+ Gold. I added liquid cooling. I added a Tesla P4. Nothing has eliminated the problem. The only thing left is the CPU. I am waiting for a Xeon E3-1270 v6 to arrive in a few days. I'll swap it out and see if that helps. If not then I'm at a total loss as to what could be causing it! Could it be BIOS related? I have BMC connected, but don't have the password, so I will need to reset it via the jumper? Then I can review it on a remote PC. I have link aggregation connected from the mobo to my ASUS GT-AC5300 router. I literally have no idea what else to try! I have the syslog going to root on the flash drive, but nothing seems to stand out to you or others... Do you think it could just be a bad CPU?
  7. Ok, so I went away to the lake yesterday morning and left the server running. I got back an hour ago and it's all locked up again. Not showing on the network either. Did a hard reset and it came up fine. Here are the logs. I can't figure this out! Someone please point me in the right direction! syslog syslog-previous
  8. No, I don't leave any active connections. It's a headless server and I only log in to run a process or to try to figure out an issue such as this. I'm going to enable the IPMI function and connect that LAN port for diagnostics later this weekend.
  9. So yesterday I woke up to an unresponsive server again. No network connectivity, nothing. So, I decided to try 2 more things. I changed the PSU from a 600W to a new Corsair 850 80+ Gold. I then added a Corsair water cooler for the CPU. Again this morning, I woke to a non-responsive server. But this time it was still showing on the router as connected. If it crashes over the weekend, I'll upload a new set of logs. goldraid-diagnostics-20240216-2022.zip syslog-2.txt
  10. I will after this learning experience! lol It’s running fine now. I’ll post in the morning and let you know if it crashed again. Thanks again for all your help!
  11. Ok, then how would I reset it to that?
  12. I went to the bash command line and made sure all appdata / domains / system shares were moved to the cache using: rsync -av --remove-source-files /mnt/disk2/appdata/ /mnt/cache/appdata/ It moved all files. I then removed all empty folders left behind: find /mnt/disk2/appdata/ -type d -empty -delete I corrected the appdata folder permissions: chmod -R 755 /mnt/cache/appdata/ chown -R nobody:users /mnt/cache/appdata/ I made sure that the appdata / domains / system shares were all now cache only in my shares menu. I made sure that the data & iCloud-drive-sync shares were all now pointing to the array in my shares menu. I reloaded the Nvidia driver and installed the GPU Statistics plugin. I changed the macvlan to ipvlan. I have checked the cache pool and it's still pretty full (I think). I have a 2TB SSD and a 256GB SSD. Do I need a larger cache pool? What am I missing? And before I forget, huge thanks to trurl & JorgeB for all your help. I really appreciate the time you are taking!
  13. NETWORK ID     NAME     DRIVER   SCOPE
      400cca84e0d2   br0      ipvlan   local
      06eb05e8c95b   bridge   bridge   local
      74b5ac950c5a   host     host     local
      8c0d0933753b   none     null     local
  14. So, it crashed again. Here is the syslog info. syslog-previous.txt syslog.txt
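
The share move described in post 12 could be sketched as a small script along these lines. The four commands and the two paths are the ones quoted in that post; the `DRY_RUN` switch and the `run` / `move_share` helper names are assumptions added here for illustration, so that the commands can be previewed before anything is deleted.

```shell
#!/bin/bash
# Sketch of the appdata move from post 12.
# DRY_RUN, run and move_share are hypothetical names, not Unraid features.

DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

move_share() {
    # $1 = source directory on the array disk, $2 = destination on the cache pool.
    # Trailing slashes are normalised so rsync copies the directory contents.
    local src="${1%/}/"
    local dst="${2%/}/"
    run rsync -av --remove-source-files "$src" "$dst"   # copy files, then remove them from the source
    run find "$src" -type d -empty -delete              # remove the empty folders left behind
    run chmod -R 755 "$dst"                             # permissions used in the post
    run chown -R nobody:users "$dst"                    # ownership used in the post
}

move_share /mnt/disk2/appdata /mnt/cache/appdata
```

With the default `DRY_RUN=1` the script only prints the four commands; setting `DRY_RUN=0` would execute them, which is best done with the Docker service stopped so nothing is writing to appdata during the move.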