
Pri

Everything posted by Pri

  1. The only thing I would have changed is I'd have gone with 4 x 3.84TB or 4 x 7.68TB U.2/U.3 drives, the reason being that I need more flash capacity. I have Mover set to move data off the cache quite quickly at the moment because I'm constantly brushing up against the limit. There are terabytes more of data I want held in the cache for performance reasons and I'm often at the limit; I think I was at 97% the other day. These 2TB SSDs are great but I need more capacity.

     I still have an x16 slot unpopulated and two of these connectors on the board, each of which can take 2 x U.2 or M.2 drives (with adapters). So I could potentially add 8 x U.2 drives in total, though I'm more likely to just buy 4. That's probably what I'll change up next.

     The 980 Pros are fast. I've managed to hit 18GB/s with kernel updates and other changes, so they're not slouches at all; I love the performance and the temperature characteristics, but the capacity, 2TB per drive, just wasn't enough. Endurance-wise I'm not as concerned: each drive is rated for 1.2PB, so with 4 in RAID0 like I have now that's 4.8PB of write endurance, and I think I've used 2% on each drive since I installed them just over a year ago. But I need more capacity, and I think U.2 enterprise drives are the way to go since the cost per TB is so much lower, as you pointed out.

     Apart from that I don't think I'd change anything to be honest. I love the CPU and motherboard, the RAM quantity was perfect for what I do, and I'm happy with the 2 x 2TB SN850s which I'm using for VMs and Dockers in RAID1. Based on what I said above, I'd probably also advise you to go with U.2 drives if you need the capacity, consistent performance across the whole drive, or the high endurance.
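
     For anyone wanting to sanity check their own pool, the endurance maths is simple enough to put in a few lines. A quick sketch below, using the numbers from my setup (the 1.2PB figure is the rated TBW of my 2TB drives, the rest are just my values):

     ```python
     # Rough write-endurance estimate for a striped (RAID0) cache pool.
     # These figures are just my setup: 4 x 2TB drives, each rated 1.2 PB TBW.
     drives = 4
     tbw_per_drive_pb = 1.2        # rated write endurance per drive, in PB
     wear_percent_per_drive = 2    # wear reported by SMART after ~1 year

     # RAID0 stripes writes across all members, so pool endurance is the sum.
     pool_endurance_pb = drives * tbw_per_drive_pb
     written_so_far_pb = pool_endurance_pb * wear_percent_per_drive / 100
     projected_years = 100 / wear_percent_per_drive  # ~1 year per 2% of wear

     print(f"Pool endurance:   {pool_endurance_pb:.1f} PB")
     print(f"Written so far:  ~{written_so_far_pb:.2f} PB")
     print(f"Projected life:  ~{projected_years:.0f} years at the current rate")
     ```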
  2. I considered the same thing, to wait and get an RTX 4000 Ada 20GB instead, or even buy one brand new. But when I looked at the situation codec-wise: everything I get in 4K is H.265, whether it's 4K HDR or 4K SDR, and stuff I get that isn't 4K is H.264. There's no source of media I use that has AV1 files, but if there were, the RTX A4000 can decode AV1, it just can't encode it. And all the clients I need to transcode for support either H.264 or H.265, so it felt like a good fit. I think if I were to buy a newer card I'd want one with VVC (H.266) support, which likely won't come until the next generation or even the one after that. That appears to be the codec that streamers will be settling on and the one to go mainstream; it's looking like AV1 will not make it big, and in fact some TVs at CES this year lacked AV1 support but had VVC support. Something to think about!
  3. I've done a whole bunch of updates to the server since the initial build which I think would be good to share.

     Most recently I added an RTX A4000 for Plex transcoding and AI/LLM use. In the initial build I used a GTX 1080 Ti, but it's Pascal, a bit power hungry, and lacks the newer codec support I needed. I removed the 1080 Ti a long time ago and just recently replaced it with the RTX A4000. The main reason I chose this card is that it's single-slot, so it only uses up one slot instead of two; I still have one slot left unused that I may use for something in the future. Below are two photos of the card, I think it looks really quite beautiful. I've found the card idles around 10 Watts with the appropriate idle power mode script applied, and the fan is not audible to me and stays around 40% in my chassis.

     In addition to adding this card I've also made some other system changes. I removed the Supermicro CPU cooler because I found it quite loud when the CPU got warm, and I've since swapped it for an Arctic 4U-M which is basically inaudible. The Supermicro had a 92mm fan which idled around 2,500 RPM to keep my chip cool and would go as high as 3,500 RPM under load. The newer Arctic 4U-M features two 120mm fans and runs much quieter, with them spinning around 950 RPM at idle and 1,200 RPM under load. Quite a remarkable difference in not only temperatures but noise.

     I also switched out my Intel X540-T2 for an X550-T2 because I needed the 2.5GbE capability the X550 features for a cable modem I'm using, while its other 10GbE port is needed for a fibre modem from my second service provider. The X540 only supports 1Gb/10Gb, so it had to be swapped out. I also swapped out the generic PCIe to M.2 card I was using for the 4 x 980 Pro SSDs because it wasn't stable: after about 6 months I had issues with one or more SSDs randomly disconnecting from the system, which I believe was down to the weak VRM situation on the PCIe card. For this reason I switched to an Asus Hyper M.2 PCIe 4.0 card which has a much stronger VRM setup, and since I made that switch about half a year ago I've had no more issues with my 980 Pros.

     Another addition to the server was an Intel X710-DA4 card. This is a four port SFP+ (10Gb) card. I'm using it for the LAN side of my pfSense virtual machine, unRAID itself and a Windows VM. My motherboard does have dual 10GbE built in, but I just prefer SFP+, so I've stopped using those ports for the time being (though I may use them in the future for VMs that I want to have dedicated connectivity).

     Below is a photo of the current internal state of the server with all the changes I mentioned above. That's the update for right now. The system is still humming along dutifully.
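
     In case anyone wants the gist of the idle power side: the heart of it is keeping the driver initialised (persistence mode) so the card can settle into its low power states, and optionally capping the power limit. A rough sketch of that idea below, not my exact script; the power limit figure is only an example, check what your card accepts first:

     ```python
     import subprocess

     def nvsmi(*args):
         """Run nvidia-smi with the given arguments and return its output."""
         result = subprocess.run(["nvidia-smi", *args],
                                 check=True, capture_output=True, text=True)
         return result.stdout.strip()

     # Persistence mode keeps the driver loaded so the card isn't repeatedly
     # re-initialised, which lets it sit in its idle power states.
     nvsmi("-pm", "1")

     # Optional: cap board power. 100 is only an example value here; check
     # the supported range with `nvidia-smi -q -d POWER` before using it.
     # nvsmi("-pl", "100")

     # Confirm where the card is actually sitting.
     print(nvsmi("--query-gpu=name,pstate,power.draw", "--format=csv,noheader"))
     ```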
  4. Yeah it looks like that, but I've completely changed the USB port, the USB stick and so on. It seems to be a symptom of the problem: there is some memory allocation issue, and that triggers a cascade of other issues, with that USB error showing up as part of it. I checked my old USB key and did all kinds of diagnostics on it, and it seems to be perfectly fine. I'm at a loss to explain what is going on really. Regarding safe mode, are there any downsides to using it, and is there anything you want me to provide once I'm running in safe mode?
  5. Someone suggested I run memtest86, so I have done that. After two hours it passed with no errors of any kind. I am allowing it to continue to run for the next 10 hours or so just to be sure though. I will add this as an edit to my above post as well. EDIT: It kept passing for 7 hours and 20 minutes before I turned it off, as I needed the server up to do some work.
  6. The problem: For about a month now, since I upgraded to the 6.12.x branch (.3 and then .4 only, I did not use .0, .1 or .2), my server has randomly started half-crashing every 1 to 16 days. Sometimes twice a day, sometimes only once a week; it seems random. I did not have any issues like this on 6.11, it was stable as a rock.

     What I mean by half-crashing: my disk shares stop working, one of my Windows Server 2022 virtual machines crashes (the others do not), the WebUI becomes very slow to respond but does function, and all my disks (NVMe SSDs and SATA hard drives) enter a sleep state and won't wake up. Only a reboot fixes it.

     So far what I've tried:

     1. A brand new USB stick to boot from. Some people previously said that based on my log it looked like my USB stick was failing.
     2. Disabling C-States in the BIOS.
     3. Uninstalling the Unassigned Devices plugin (some thought it could be related, as others on reddit are having the same crashes as I am as per this thread: https://www.reddit.com/r/unRAID/comments/16yz0gd/possible_fix_for_people_crashing_on_612/ but it appears to be unrelated).
     4. Running memtest86, which passed after 2 hours; I let it run for a total of 7 hours and 20 minutes without errors.

     Specs of my server / other hardware-related details: 3rd Gen EPYC Milan with 7 NVMe devices and 11 hard disk drives. The HBA I'm using is a 9500-8i. I do not have a dedicated GPU installed, and I'm not doing any kind of fancy graphics stuff with my VMs either. All the firmware and BIOS are up to date on everything (motherboard, HBA, network cards, NVMe drives etc).

     Diagnostics taken today during the half-crash: hyepyc-diagnostics-20231021-1828.zip

     I did make a thread about this earlier and marked it as resolved because everyone on discord told me to get a new USB key as my current one seemed broken. That is what I did, but it hasn't resolved the issue, so I'm making the topic again with fresh diagnostics. The only real differences between then and now are that the new USB key is in and that I uninstalled the Unassigned Devices plugin before this most recent crash. Any help is greatly appreciated.

     Also, you may see a lot of SSH logins in my log; this is from external software I run to automate Docker since unRAID doesn't have an official API. I have since disabled this in case there is some kind of SSH-related out of memory bug in play.
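
     For context on the SSH logins specifically: the automation just connects over SSH and runs ordinary docker commands, nothing exotic. Roughly along these lines (a simplified sketch, not the actual tool; the host and container name are placeholders and the real thing has proper error handling):

     ```python
     import shlex
     import subprocess

     HOST = "root@hyepyc.local"   # placeholder address for the unRAID box

     def docker(*args):
         """Run a docker command on the server over SSH and return its output."""
         remote_cmd = "docker " + " ".join(shlex.quote(a) for a in args)
         result = subprocess.run(["ssh", HOST, remote_cmd],
                                 check=True, capture_output=True, text=True)
         return result.stdout.strip()

     # Example: list running containers, then restart one of them.
     print(docker("ps", "--format", "{{.Names}}\t{{.Status}}"))
     docker("restart", "example-container")   # placeholder container name
     ```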
  7. The problem happened again today, even with Unassigned Devices and Unassigned Devices Plus completely uninstalled. So the mystery continues and I shall make a thread for my issue with the latest diagnostics I took for today's lockup. Thank you for your help.
  8. There is software which logs in to run commands for Docker. When the server is working properly it only needs to log in very seldom, so that isn't me, it's automated software doing that. All the disks have 100GB or more free; they're not at 100%, they're high capacity disks. I've uninstalled the UD plugin for the time being to see if the crash still occurs, because I too have no indication that it is UD; I'm just going by what the guy said on reddit and trying it. He said reinstalling UD fixed it for him, and in my case I was getting these lockups daily until I reinstalled UD, at which point I had 16 days of uptime without issues until today. So now I've uninstalled UD entirely and I'll leave it like that for at least 2 months, or until I get a crash again, which would rule out UD as the culprit entirely if that happens.
  9. https://forums.unraid.net/topic/92462-unassigned-devices-managing-disk-drives-and-remote-shares-outside-of-the-unraid-array/page/399/#comment-1312939
  10. Sorry this took so long, I had to wait for the problem to occur again, which it just did:
  11. Some of us are having lockups with out of memory issues, and one user thinks he has found the culprit and that it's the Unassigned Devices plugin, but there's nothing to discern whether that really is the case. Here is the thread on reddit about it in case you want to take a look @dlandon (I'm having the same out of memory lockups as those described in the thread, and I am using Unassigned Devices with a USB-based external hard drive, but I have no idea whether this issue is correlated or not).
  12. Changed my USB stick yesterday and so far so good: no more errors, no more lockups or crashes. Thank you to the people on the unRAID discord who believed the issue was related to the USB key, as I postulated in my initial post.
  13. unRAID version in use: 6.12.3 (first crash), 6.12.4 (second and third crash).

      I've had this issue three times recently (within the past month) where one of the three VMs (Windows Server 2022 x 2 + pfSense) I'm running on the system crashes or locks up. Then when I access the unRAID UI, the "Main" tab and the "VMs" tab just hang and don't load. A reboot fixed it the first time, so I thought it was a fluke. About a month later it happened again; I rebooted again, but this time it was only fixed for about 3 hours before it happened a third time.

      I have included the log below as I have the syslog enabled. It covers the most recent lockup and shutdown/boot-up sequence, and I have logs going back further, all the way to August, if that's helpful. Looking at the log myself, it seems that maybe the USB stick I'm using is failing, but I cannot be certain, as I don't know whether this kind of USB-related error is a symptom of something else, so I'll leave that determination to your experience. Thank you for any assistance. syslog-hyepyc.log
  14. Set the fan speed in the motherboard's IPMI menus to 100%, then use the IPMI Tools plugin in unRAID to set the RPM. If you set the motherboard's IPMI fan system to anything other than full speed / 100%, its own fan logic will conflict with the unRAID IPMI plugin.
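
      If you'd rather test from the command line first, the same thing can be done with ipmitool. The raw byte sequences below are the commonly used Supermicro ones for fan mode and duty cycle; please verify them against your own board's documentation before sending them, as they are vendor and generation specific:

      ```python
      import subprocess

      def ipmi_raw(*hex_bytes):
          """Send a raw IPMI command to the local BMC via ipmitool."""
          subprocess.run(["ipmitool", "raw", *hex_bytes], check=True)

      # 1) Put the BMC's own fan control into "Full" mode so it stops
      #    overriding you (same effect as setting 100% in the IPMI menus).
      ipmi_raw("0x30", "0x45", "0x01", "0x01")

      # 2) Set your own duty cycle per zone: 0x00 = CPU/system fan zone,
      #    0x01 = peripheral zone, last byte = duty cycle in percent (hex).
      ipmi_raw("0x30", "0x70", "0x66", "0x01", "0x00", "0x32")  # zone 0 -> 50%
      ```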
  15. Yes, you can access the IPMI KVM from the WebUI. That will allow you to enter the BIOS remotely over the network. No video card is necessary, as there is one included in the BMC chip on the motherboard.
  16. Ah yes, you're 100% correct, the Supermicro motherboard has it inverted compared to the manual. I remember now having to think that one out as well.
  17. To connect the front panel USB you will need to use a USB3 to USB2 adapter. I did buy one but didn't end up using it and just left the front panel disconnected. I'm unable to take the photo you requested as the server is powered on and in use, and I can't turn it off at the moment.
  18. The 9500-8i I purchased is an IT mode HBA, yes. It lacks the RAM for RAID modes. You would have to buy a 9560 to get RAID functionality (RAID5 / RAID6 etc).
  19. I did try the powertop script months ago; it didn't have any effect on my system, by which I mean the power use never changed before or after. I think the lowest C-state I saw was C2, but that's just going from memory. I did notice though that the power use of the x86 cores is already very low, they can go sub 0.25 Watts. But the I/O die, which handles PCIe, memory and the Infinity Fabric to/from the x86 cores, uses a lot of energy; the lowest I saw it at was around 19.5 Watts while the x86 cores were all sub 0.25. I suspect an Intel system would have better luck with testing what the 9500-8i can do sleep-state wise.
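
      For anyone who wants to check this on their own box rather than go from memory like I did, the kernel exposes the cpuidle counters in sysfs, so you can see which C-states a core actually reaches and how long it spends in them. A quick sketch:

      ```python
      from pathlib import Path

      # Each cpuidle state directory exposes a name and the total time (in
      # microseconds) the core has spent in it; non-zero time means the
      # state is actually being reached.
      cpu0 = Path("/sys/devices/system/cpu/cpu0/cpuidle")

      for state in sorted(cpu0.glob("state*")):
          name = (state / "name").read_text().strip()
          usec = int((state / "time").read_text())
          print(f"{state.name}: {name:<10} {usec / 1_000_000:.1f}s")
      ```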
  20. I see, so this is a limitation of Docker and not unRAID. The reason I did it this way (two bridges that go to the same network) is because of performance issues: if my VMs are hammering the bridge, Docker containers can lose connectivity. So I shifted them to different bridges and gave each bridge dedicated physical ports on my server. I'm running all VMs on br1 while Docker uses br0, and now they don't cause contention issues with each other. I could instead create a bond, but my VMs could just eat all of that too, hence the separation. This isn't really a problem since the VM manager in unRAID lets me select br0 or br1 for my VMs, so as I said above I just use br1 for VMs; I only thought it was a bug because of the discrepancy between what VMs let me select and what Docker lets me select. Thank you for the clarification.
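
      For anyone curious what a Docker network tied to a specific bridge actually looks like under the hood, it's just a custom network with that bridge as its parent interface. The sketch below uses the Docker SDK for Python to create one by hand; the driver, subnet, gateway and names are assumptions for illustration only, and on unRAID the Docker settings page normally manages these networks for you:

      ```python
      import docker
      from docker.types import IPAMConfig, IPAMPool

      client = docker.from_env()

      # Create a custom network whose parent interface is br1, so containers
      # attached to it get their own MACs/IPs on that bridge's network.
      network = client.networks.create(
          "br1-net",                       # example name
          driver="macvlan",                # assumption: macvlan-style network
          options={"parent": "br1"},
          ipam=IPAMConfig(pool_configs=[
              IPAMPool(subnet="192.168.2.0/24", gateway="192.168.2.1"),
          ]),
      )
      print("created network:", network.name)
      ```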
  21. I have them set up in my Docker settings, but only br0 shows in the dropdown, not br1. Both of them work fine for VMs. This is what I see on my Docker settings page: This isn't a bug?
  22. To replicate the bug: if you have two bridges in your unRAID, the Docker "Network Type" dropdown will only offer up the first one, like so: When instead it should offer all the bridges. In my system I have both br0 and br1; I'm able to select either for virtual machines, but not for Docker. This is how I believe it should look: Thank you.
  23. My case came with two brackets from Gooxi, which allow for single or dual-redundant power supplies to be fitted. I also had a third bracket from Xcase, the retailer I bought mine from, which was an ATX bracket so that I could use a normal power supply, and that was the one I installed. So to be clear, the ATX bracket is not stock, but the single and dual server power supply brackets are standard from Gooxi and included in the accessory kit which ships with the case.
  24. Usually only my completely idle cores are at 1.5GHz; active cores are between 3GHz and 3.8GHz. When there's low CPU utilisation and only say 3-4 threads are active, I've seen them hit 4GHz. No issues with CPU utilisation here, the frequencies are quite high all the time; I'd say 3.2GHz is the most common frequency I see due to multithreaded loads. I'm using the OnDemand CPU scaler with Performance Boost enabled.
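
      If you want to confirm what your own system is doing, the governor and current frequencies are all exposed in sysfs, so there's no need for extra tooling. A small sketch (the boost knob shown here is the acpi-cpufreq one; other scaling drivers expose it differently):

      ```python
      from pathlib import Path

      cpu_root = Path("/sys/devices/system/cpu")

      # Scaling governor and current frequency for each cpufreq policy.
      for policy in sorted(cpu_root.glob("cpufreq/policy*")):
          governor = (policy / "scaling_governor").read_text().strip()
          cur_khz = int((policy / "scaling_cur_freq").read_text())
          print(f"{policy.name}: {governor}, {cur_khz / 1_000_000:.2f} GHz")

      # Whether frequency boost is enabled (path used by acpi-cpufreq).
      boost = cpu_root / "cpufreq" / "boost"
      if boost.exists():
          print("boost enabled:", boost.read_text().strip() == "1")
      ```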
  25. Essentially yes. I've recently switched my setup up: I put an Intel X550-T2 (1Gb/2.5Gb/5Gb/10Gb capable) into my unRAID box in place of the X540-T2 (1Gb/10Gb capable), and I've hooked up a single cable from that card to my modem for WAN (2.5Gb/s). For the LAN side, I'm using a network bridge created by my unRAID box and I pass a VirtIO NIC through to my pfSense virtual machine. If I need my dedicated pfSense box for some reason I can just plug its WAN cable into my modem and turn it on; I left its LAN cable hooked up to my switch even though it's powered off. I've found pfSense runs really well virtualised on this server, the performance exceeds my dedicated box (I have 1.2Gb home internet), and the energy usage is a lot lower than with a physical box. I'm saving almost 50 Watts at idle by virtualising pfSense as a result.

      Hey cbapel, I'm glad you found the thread useful! So I can answer your questions. Firstly, I have a friend who is passing his HBA through to unRAID from TrueNAS, where he runs unRAID in a VM. He's using the same motherboard that you and I are using, except he has the SKU with the built-in HBA, and that is the one he's passing through. For him it works 100%, so you could totally pass a 9300, 9400, 9500 etc through to unRAID from Proxmox and it would function properly.

      As for the SlimSAS cable: it has 2 x MiniSAS on one side and 1 x SlimSAS on the other, and the SlimSAS side goes into the HBA. So basically the SlimSAS carries a PCIe x8 link or 8 x SATA/SAS. The backplane in my case is an expander backplane; it only needs a single cable connection to enable all 24 drives, but it has two inward connections to increase bandwidth. The third connection on my backplane is an output intended to let you connect it to another backplane for daisy-chaining. In my case I have a single 24-slot backplane at the front, but if I had a 36-bay case, which is 24 bays at the front and 12 bays at the rear, a cable could be used to connect those two backplanes, enabling all 36 drives to be used from a single or dual-port HBA. The HBA I'm using has a single port that acts like dual ports when connecting to MiniSAS backplanes, thanks to the purchasable breakout cable (SlimSAS to 2 x MiniSAS). The cable I went with was from Broadcom directly.

      If your backplane is not an expander type then you would need to purchase an HBA that can provide at least three MiniSAS connectors for your three columns of drives. The 16i models (9300, 9400, 9500, 9600 etc) are popular choices for this. If you do decide to go with a 9500 or 9600 HBA that uses the new SlimSAS connector, please make sure you get the right cable, as there are two different varieties: one for PCIe backplanes only and one for SATA/SAS backplanes. If you buy the official Broadcom cables, the PCIe-only cable is green while the SAS version is black. The two types are not pin compatible even though they'll fit the same receptacles at both ends, so it's important to make sure the SlimSAS to MiniSAS cable you purchase is the correct one for your backplane type (I believe in your case you want the pure SAS version).