Pri


  1. The only thing I would have changed is I'd have gone with 4x3.84TB or 4x7.68TB U.2/U.3 drives, the reason being that I need more flash capacity. I have Mover set to move stuff off quite quickly at the moment because, well, I'm like this constantly. There are terabytes more data I want held in the cache for performance reasons and I'm often at the limit; I think I was at 97% the other day. These 2TB SSDs are great but I need more capacity. I still have an x16 slot unpopulated and two of these connectors on the board, each of which can take 2 x U.2 or M.2 drives (with adapters). So I could potentially add 8 x U.2 drives total, though I'm more likely to just buy 4. That's probably what I'll change next. The 980 Pros are fast. I've managed to hit 18GB/s with kernel updates and other changes, so they're no slouches; I love the performance and the temperature characteristics, but 2TB per drive just wasn't enough capacity. Endurance-wise I'm not as concerned: each drive is rated for 1.2PB, so with 4 in RAID0 like I have now that's 4.8PB of write endurance, and I think I've used 2% on each drive since I installed them just over a year ago. But I need more capacity, and I think U.2 enterprise drives are the way to go; the cost per TB is so much lower, as you pointed out. Apart from that I don't think I'd change anything, to be honest. I love the CPU and motherboard, the RAM quantity was perfect for what I do, and I'm happy with the 2 x 2TB SN850s I'm using for VMs and Dockers in RAID1. Based on all that, I'd advise you to also go with U.2 drives if you need the capacity, consistent performance across the whole drive, or the high endurance.
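To put rough numbers on that endurance claim, here's a quick back-of-the-envelope calculation in Python. The drive figures come from this post; the projection simply extrapolates the ~2% wear per year linearly, which is an assumption, not a guarantee:

```python
# Rough endurance estimate for 4 x Samsung 980 Pro 2TB in RAID0.
# Figures from the post: 1.2 PB rated writes (TBW) per drive,
# ~2% SMART "percentage used" per drive after about a year.

TBW_PER_DRIVE_PB = 1.2
DRIVES = 4
pool_endurance_pb = TBW_PER_DRIVE_PB * DRIVES  # RAID0 stripes writes evenly

PERCENT_USED_PER_YEAR = 2.0
years_remaining = (100 - PERCENT_USED_PER_YEAR) / PERCENT_USED_PER_YEAR

print(f"Aggregate pool endurance: {pool_endurance_pb} PB")              # 4.8 PB
print(f"Linear projection: ~{years_remaining:.0f} years of wear left")  # ~49
```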
  2. I considered the same thing: to wait and get an RTX 4000 Ada 20GB instead, or even buy one brand new. But I looked at the situation codec-wise: everything I get in 4K is H.265, whether it's 4K HDR or 4K SDR, and stuff I get that isn't 4K is H.264. No source of media I use has AV1 files, and even if one did, the RTX A4000 can decode AV1; it just can't encode it. All the clients I need to transcode for support either H.264 or H.265, so it felt like a good fit. If I were to buy a newer card I'd want one with VVC (H.266) support, which likely won't come until the next generation or even the one after that. VVC appears to be the codec that streamers will settle on and the one to go mainstream; it's looking like AV1 will not make it big, and in fact some TVs at CES this year lacked AV1 support but had VVC support. Something to think about!
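If anyone wants to check which codecs actually dominate their library before picking a card, a small ffprobe tally does the job. This is a minimal sketch: it assumes ffprobe is on the PATH, and /mnt/user/media is a hypothetical share path, not necessarily mine:

```python
# Count video codecs across a media library with ffprobe, to see
# whether H.264/H.265/AV1 decode support actually matters for you.
import subprocess
from collections import Counter
from pathlib import Path

MEDIA_ROOT = Path("/mnt/user/media")   # hypothetical unRAID share
EXTENSIONS = {".mkv", ".mp4", ".m4v"}

counts = Counter()
for f in MEDIA_ROOT.rglob("*"):
    if f.suffix.lower() not in EXTENSIONS:
        continue
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name", "-of", "csv=p=0", str(f)],
        capture_output=True, text=True)
    codec = result.stdout.strip()
    if codec:
        counts[codec] += 1

for codec, n in counts.most_common():
    print(f"{codec}: {n} files")
```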
  3. I've done a whole bunch of updates to the server since the initial build which I think would be good to share. Most recently I added an RTX A4000 for Plex transcoding and AI/LLM use. The initial build used a GTX 1080 Ti, but it's Pascal, a bit power hungry, and lacks the newer codec support I needed; I removed it a long time ago and just recently replaced it with the RTX A4000. The main reason I chose this card is that it's single-slot, so it uses up only one slot instead of two, leaving one slot free that I may use for something in the future. Below are two photos of the card; I think it looks really quite beautiful. I've found the card idles around 10 watts with the appropriate idle power mode script applied, and the fan is not audible to me, staying around 40% in my chassis. In addition to adding this card I've made some other system changes. I removed the Supermicro CPU cooler because I found it quite loud when the CPU got warm, and I've since swapped it for an Arctic 4U-M, which is basically inaudible. The Supermicro cooler had a 92mm fan that idled around 2,500 RPM to keep my chip cool and would go as high as 3,500 RPM under load; the Arctic 4U-M has two 120mm fans and runs much quieter, spinning around 950 RPM at idle and 1,200 RPM under load. A quite remarkable difference in not only temperatures but noise. I also switched out my Intel X540-T2 for an X550-T2 because I needed the 2.5GbE capability the X550 has for a cable modem I'm using, while its other 10GbE port serves a fibre modem from my second service provider; the X540 only does 1Gb/10Gb, so it had to go. I also swapped out the generic PCIe-to-M.2 card I was using for the 4 x 980 Pro SSDs because it wasn't stable: after about 6 months I had issues with one or more SSDs randomly disconnecting from the system, which I believe was down to the weak VRM situation on the card. I switched to an Asus Hyper M.2 PCIe 4.0 card, which has a much stronger VRM setup, and since making that switch about half a year ago I've had no more issues with my 980 Pros. Another addition was an Intel X710-DA4, a four-port SFP+ (10Gb) card. I'm using it for the LAN side of my pfSense virtual machine, unRAID itself, and a Windows VM. My motherboard does have dual 10GbE built in, but I just prefer SFP+, so I've stopped using those ports for the time being (though I may use them in the future for VMs that I want to have dedicated connectivity). Below is a photo of the current internal state of the server with all the changes mentioned above. That's the update for right now; the system is still humming along dutifully.
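On the idle power point: for a headless card like this, the usual trick is enabling the NVIDIA driver's persistence mode at array start so the GPU can settle into its low-power state. A minimal sketch of that idea (not my exact script), assuming the NVIDIA driver is installed so nvidia-smi is available:

```python
# Keep the NVIDIA driver initialized on a headless server so the GPU
# can drop to its low-power idle state instead of being repeatedly
# torn down and reinitialized. Run once at array start.
import subprocess

# Enable persistence mode.
subprocess.run(["nvidia-smi", "-pm", "1"], check=True)

# Report current power draw to verify the card has settled down.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader"],
    capture_output=True, text=True, check=True)
print("GPU power draw:", out.stdout.strip())
```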
  4. Yeah, it looks like that, but I've completely changed the USB port, the USB stick, and so on. It seems to be a symptom of the problem: there's some memory allocation issue, and that sets off a cascade of issues that includes that USB error. I checked my old USB key and ran all kinds of diagnostics on it; it seems to be perfectly fine. I'm at a loss to explain what is going on, really. Regarding safe mode, are there any downsides to using it, and is there anything you want me to provide once I'm running in safe mode?
  5. Someone suggested I run memtest86, so I have done that. After two hours it passed with no errors of any kind. I am allowing it to continue running for the next 10 hours or so just to be sure, though. I will add this as an edit to my above post as well. EDIT: It kept passing for 7 hours and 20 minutes before I turned it off, as I needed the server up to do some work.
  6. The problem: for about a month now, since I upgraded to the 6.12.x branch (.3 and then .4 only; I did not use .0, .1 or .2), my server has randomly started half-crashing every 1 to 16 days. Sometimes twice a day, sometimes only once a week; it seems random. I did not have any issues like this on 6.11, it was stable as a rock. By half-crashing I mean: my disk shares stop working, one of my Windows Server 2022 virtual machines crashes (the others do not), the WebUI becomes very slow to respond but does function, and all my disks (NVMe SSDs and SATA hard drives) enter a sleep state and won't wake up. Only a reboot fixes it. What I've tried so far: 1. A brand-new USB stick to boot from (some people previously said that, based on my log, my USB stick looked like it was failing). 2. Disabling C-States in the BIOS. 3. Uninstalling the Unassigned Devices plugin (some thought it could be related, since others on Reddit are having the same crashes as I am per this thread: https://www.reddit.com/r/unRAID/comments/16yz0gd/possible_fix_for_people_crashing_on_612/ but it appears to be unrelated). 4. Running memtest86; it passed after 2 hours, and I let it run for a total of 7 hours and 20 minutes without errors. Specs of my server and other hardware details: 3rd Gen EPYC Milan with 7 NVMe devices and 11 hard disk drives; the HBA is a 9500-8i. I do not have a dedicated GPU installed, and I'm not doing any kind of fancy graphics work with my VMs either. All firmware and BIOS are up to date on everything (motherboard, HBA, network cards, NVMe drives, etc.). Diagnostics taken today during the half-crash: hyepyc-diagnostics-20231021-1828.zip I made a thread about this earlier and marked it as resolved because everyone on Discord told me to get a new USB key since my current one seemed broken. That's what I did, but it hasn't resolved the issue, so I'm making the topic again with fresh diagnostics. The only real differences between then and now are that the new USB key is in and that I uninstalled the Unassigned Devices plugin before this most recent crash. Any help is greatly appreciated. You may also see a lot of SSH logins in my log; these come from external software I run to automate Docker, since unRAID doesn't have an official API. I have since disabled this in case there is some kind of SSH-related out-of-memory bug in play.
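For anyone trying to line their own syslog up against these half-crashes, a quick scan for out-of-memory killer entries looks like this. A sketch only; the log path is a placeholder for wherever your syslog is written:

```python
# Print out-of-memory killer events (with their timestamps) from a
# syslog, to correlate them against when the server half-crashes.
import re
from pathlib import Path

SYSLOG = Path("/var/log/syslog")  # placeholder path to your syslog

oom = re.compile(r"out of memory|oom-killer|oom_reaper", re.IGNORECASE)

with SYSLOG.open(errors="replace") as f:
    for line in f:
        if oom.search(line):
            print(line.rstrip())
```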
  7. The problem happened again today, even with Unassigned Devices and Unassigned Devices Plus completely uninstalled. So the mystery continues, and I shall make a thread for my issue with the latest diagnostics I took for today's lockup. Thank you for your help.
  8. There is software which logs in to run commands for Docker; when the server is working properly it only needs to log in very seldom. So that isn't me, it's automated software doing that. All the disks have 100GB or more free; they're not at 100%, they're high-capacity disks. I've uninstalled the UD plugin for the time being to see if the crash still occurs, because I too have no indication that it is UD; I'm just going by what the guy on Reddit said and trying it. For what it's worth, before I followed his suggestion to reinstall UD I was getting these lockups daily; after reinstalling it, I had 16 days of uptime without issues until today. So now I've uninstalled UD entirely and I'll leave it like that for at least 2 months, or until I get a crash again, which would rule out UD as the culprit entirely if it happens.
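For context, the automation I mentioned does something along these lines: shelling in over SSH to run Docker commands, since unRAID lacks an official API. This is an illustrative sketch, not my actual software; the hostname, user, and key path are placeholders, and it assumes the paramiko library is installed:

```python
# Minimal example of automating Docker over SSH, which is why the
# syslog shows periodic SSH logins.
import paramiko

HOST = "tower.local"                    # placeholder unRAID hostname
USER = "root"
KEY_FILE = "/home/me/.ssh/id_ed25519"   # placeholder key path

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(HOST, username=USER, key_filename=KEY_FILE)

# The kind of command the tooling runs: list running containers.
_, stdout, _ = client.exec_command("docker ps --format '{{.Names}}'")
print(stdout.read().decode())
client.close()
```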
  9. https://forums.unraid.net/topic/92462-unassigned-devices-managing-disk-drives-and-remote-shares-outside-of-the-unraid-array/page/399/#comment-1312939
  10. Sorry this took so long; I had to wait for the problem to occur again, which it just did:
  11. Some of us are having lockups with out-of-memory issues, and one user thinks he has found the culprit: the Unassigned Devices plugin. But there's nothing to discern whether that really is the case. Here is the Reddit thread about it in case you want to take a look @dlandon (I'm having the same out-of-memory lockups as those described in the thread, and I am using Unassigned Devices with a USB-based external hard drive, but I have no idea whether this issue is correlated or not).
  12. Changed my USB stick yesterday and so far so good: no more errors, no more lockups or crashes. Thank you to the people on the unRAID Discord who believed the issue was related to the USB key, as I postulated in my initial post.
  13. unRAID version in use: 6.12.3 (first crash), 6.12.4 (second and third crashes). I've had this issue three times recently (within the past month) where one of the three VMs I'm running on the system (2 x Windows Server 2022 + pfSense) crashes or locks up. Then, when I access the unRAID UI, the "main" tab and the "vms" tab just hang and don't load. A reboot fixed it the first time, so I thought it was a fluke. About a month later it happened again; I rebooted again, but this time that only fixed it for about 3 hours before it happened a third time. I have included the log below, as I have the syslog enabled; it covers the most recent lockup and the shutdown-bootup sequence, and I have logs going back as far as August if that's helpful. Looking at the log myself, it seems that maybe the USB stick I'm using is failing, but I cannot be certain, as I don't know whether this kind of USB-stick error is a symptom of something else, so I'll leave that determination to your experience. Thank you for any assistance. syslog-hyepyc.log
  14. Set the fan speed in the motherboard's IPMI menus to 100%, then use the IPMI Tools plugin in unRAID to set the RPM. If you set the motherboard's IPMI fan system to anything other than full speed / 100%, it will conflict with the unRAID IPMI plugin.
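Under the hood, the plugin is issuing raw ipmitool commands. For the curious, here is a sketch of the equivalent by hand, assuming a Supermicro board; the raw command bytes below are the commonly documented Supermicro ones and vary by board generation, so verify them against your board's documentation before use:

```python
# Set the BMC fan mode to Full, then pin a fan zone to a fixed duty
# cycle via raw ipmitool commands (ASSUMPTION: common Supermicro raw
# bytes; confirm they match your board generation before running).
import subprocess

def ipmi_raw(*args: str) -> None:
    subprocess.run(["ipmitool", "raw", *args], check=True)

# Fan mode Full (0x01) so the BMC stops overriding manual duty cycles.
ipmi_raw("0x30", "0x45", "0x01", "0x01")

# Zone 0 (CPU zone) to a 50% duty cycle (0x32 hex = 50 decimal).
ipmi_raw("0x30", "0x70", "0x66", "0x01", "0x00", "0x32")
```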
  15. Yes, you can access the IPMI KVM from the WebUI. That will allow you to enter the BIOS remotely over the network. No video card is necessary, as there is one included in the BMC chip of the motherboard.