Everything posted by rodan5150

  1. Now that I run my docker containers that need their own IP on a dedicated "docker" NIC, no issues at all on 6.9. It has been great for months.
  2. I assume you are doing a file transfer and the speed is dropping off? I do not have Mellanox cards, but I do have the same Mikrotik switch. I have no issues with sustained reads/writes, as long as the share is set to use cache, which is NVMe in both of our setups. For me to get max speed, I have to use an MTU of 9000 on both machines. The fact that your speed drops after a period of time tells me some sort of buffer is being filled up, would be my guess. Have you done testing with iperf3? You could run a test that is longer than 30s, or whatever the time at which your file transfer drops off. That would prove it's not the NICs or the switch, since it runs entirely from RAM. If you aren't familiar with iperf3, the Nerd Tools plugin for Unraid has iperf3 as one of the included packages. Once you have it installed, you'd run "iperf3 -s" on your Unraid box. You also need iperf3 for Windows of course; then you'd do something like "iperf3 -c <ip of unraid box> -t 60 -P 10" for 60 seconds and 10 streams. Adjust accordingly. To test the other direction, add "-R" to the end, which reverses the flow so the data travels the other way. Other things to be sure of are cabling and infrastructure. Are you using copper, fiber, or DAC cables? If copper, what type: CAT6, CAT6A? How long are the runs? Lots of factors could come into play. I always start with iperf though. If it tells me I'm good over a particular link, and a test file transfer is then much slower, it's usually a misconfig (not using a share that is assigned to the NVMe) or a disk performance issue (transferring a ton of small files, like a Plex database, kills performance even on NVMe drives). Hope this helps.
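A sketch of the iperf3 commands described above (the IP address is a placeholder; substitute your Unraid server's address):

```shell
# On the Unraid box (server side); listens on TCP port 5201 by default
iperf3 -s

# On the Windows box (client side): 60-second test with 10 parallel streams.
# 192.168.1.10 stands in for the Unraid server's IP.
iperf3 -c 192.168.1.10 -t 60 -P 10

# Same test with the direction reversed (server sends, client receives)
iperf3 -c 192.168.1.10 -t 60 -P 10 -R
```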
  3. I'm worried about this, but I'm trying to stay hopeful the second NIC will be the band-aid for me for now. Uptime is almost 14 days, and no issues to speak of thus far, knock on wood...
  4. Going the vlan route did not work for me. I ended up with a call trace and a subsequent kernel panic after a number of days. So far, creating a second interface (br2 in my case) with the unused onboard NIC seems to be the "fix" for me at this point. I don't want to muddy the waters any more than they have to be, but since it could be something external to Unraid on the network (multicast, who knows...) that Unraid is choking on, perhaps it would be beneficial to also mention a brief summary of the network gear used? Maybe a common denominator will surface that can serve as a starting point for a hypothesis. Perhaps Wireshark could be utilized to help troubleshoot as well? Just throwing out ideas at this point. For reference, my offending Unraid system: Ryzen 3600, B450 chipset. br0 is an Intel X520-T2 10Gbe adapter via DAC to a Mikrotik CRS305-1G-4S+IN 10Gbe switch, with an uplink to the "core" Unifi switch. br2 is Intel I211-AT copper to the "core" 1Gbe Unifi switch.
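To follow up on the Wireshark idea, a capture could be taken right on the bridge from the Unraid console and opened in Wireshark later. This is only a sketch; the interface name, filter, and output path are examples:

```shell
# Capture broadcast/multicast traffic on br0 into a pcap file for later
# inspection in Wireshark (Ctrl-C to stop the capture)
tcpdump -i br0 -w /mnt/user/captures/br0.pcap 'multicast or broadcast'
```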
  5. In my experience, a 9000 MTU (9014 is what I have set, per the Intel driver in my Windows box; same card in my Unraid server, so I set it identically) is necessary for 10Gbe to reach full bandwidth. Though, IIRC, I still saw much faster than 1Gbe with a 1500 MTU. edit: For reference, I have the same 10Gbe switch as you, and the Unraid server is on a DAC cable to the switch. The Windows box is over copper CAT6A in the walls, close to 20m+ in length, and I get at or near the theoretical max for 10Gbe in both directions. NICs are Intel X540-T2s in both machines.
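On the Linux/Unraid side, the MTU can be checked and changed from the console as a quick test (eth0 and the IP are examples; Unraid also exposes MTU under Settings > Network Settings, and every hop in the path must allow jumbo frames):

```shell
# Show the interface's current MTU
ip link show eth0

# Enable jumbo frames on this interface
ip link set dev eth0 mtu 9000

# Verify end to end: 8972 = 9000 minus 28 bytes of IP+ICMP headers,
# with fragmentation disallowed so oversized frames fail loudly
ping -M do -s 8972 192.168.1.10
```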
  6. I was close to doing this, but I figured I'd tough it out and give 6.9.x a shot. So far, the br2 network for the containers I want to have their own IPs has been working well. No call traces yet, and certainly no kernel panics. Of course, it is barely over a week since I made that change. If I can say this a month out, then I will consider it good to go.
  7. Yeah, I reverted the C-states change back to default. The only thing I have set now is the power supply idle control. So far, what has me "fixed" is that I've moved all of my docker containers that needed a custom network (static IP) over to a separate NIC (br2). I also disabled vlans in Unraid, since I wasn't using them anymore. No kernel panics or anything, yet anyway. It's been over a week now. Fingers are crossed!
  8. Thanks for the reply, JorgeB. I'm going to give the second NIC assignment a shot. I had been trying to do all of this through a single 10Gbe connection. I've got several 1Gbe ports open on my main switch, so it's not a huge deal to just assign the containers to a second NIC. With any luck, this will solve it.
  9. Bad news. The kernel panic is back. I thought I had figured it out by moving everything from br0 to br0.x, but it looks like there has to be another issue going on, causing the call traces that ultimately end in a kernel panic. What else could I be missing? Darkhelmet syslog 5-21.txt
  10. I looked it over. I'm no expert, but you do have quite a few errors and warnings. Not sure which are critical or would cause hangups/crashes. Anyway, I took your CSV file, sorted it in descending order by date and time stamp, then exported it as a tab-delimited txt file. Maybe this will help others interpret it better. All_2021-4-28-10_7_11_tab
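The sort-and-convert step could also be done with standard command-line tools, assuming the date and time stamp are the leading comma-separated fields (that column layout is an assumption; adjust the sort keys for the real file):

```shell
# Sort a CSV descending by its first two fields (date, then time),
# then replace commas with tabs to produce a tab-delimited txt file
sort -t, -k1,1r -k2,2r input.csv | tr ',' '\t' > output.txt
```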
  11. The PassMark one is what I've used to test ECC memory. Be sure to boot the Memtest stick in UEFI mode, not the traditional Memtest86 with the blue screen that is Legacy/BIOS boot. It will have both. The UEFI one is the one that did the trick for me testing ECC; I think the legacy one does not support it.
  12. That's exactly where it was. I enabled the C-states option, and then set the idle current to typical instead of low. I've also created a Docker-specific vlan and moved all of the br0 containers over to br0.x, so hopefully that will keep my call traces and kernel panics at bay. I will update if anything changes. So far so good, but it has only been about 18 hours or so. The longest it has gone in the past was 10ish days, so if I can hit 2 weeks+ I'll consider it a win.
  13. Awesome, thanks for letting me know. I will revert the global C-state setting, and dig around and see if I can find the idle current setting.
  14. Hello all, I've been fighting a few issues since I "upgraded" to a Ryzen based system from an old dual Xeon Dell T410. The new build is a Ryzen 3600 on a B450 chipset. It was becoming unresponsive after a few days, and that seems to have gone away after disabling global C-states in the BIOS. The latest issue is that I get a kernel panic after a week or so of uptime. I have syslog enabled and was able to capture it just before the panic. I also got a pic of the screen before I rebooted, which tells the rest of the story after the syslog dropped. Looks network related, maybe macvlan? Definitely a call trace happens, but I'm not sure what can be done about it. Thanks for any help in advance! Ross syslog 4-20-21.txt
  15. I had a very similar situation this morning. I pulled the plug today on replacing a "bad" flash disk, a 32GB USB 2.0 SanDisk Cruzer Fit. I had replaced an old/cheap drive back in June of last year that was actually bad, so I had to go through support to get reinstated. Not as hassle free, but they were pretty quick to respond and get me fixed up! Anyway, I was getting read errors. I thought for sure the flash drive was dead, because when I plugged it into my Windows machine to at least yank the config file, it hosed it up proper. Windows Explorer crashed, I couldn't get into Disk Management, and nothing I tried would let me see/access the drive. I called it, and went ahead with replacing it with a backup Cruzer I still have NIB. A while later I got curious and fiddled with it more. Obviously the flash drive had a chance to cool down and such, but it has now been solid through read/write/verify testing for nearly 5 hours without a single error. So it looks like I may have replaced it prematurely, but I'm not sure why it hosed up so badly to begin with. The server is going good now, no more errors. Just thought I'd tell my story as well, in case someone else has similar errors pop up. It may not be the flash drive. -Ross
  16. I just installed the plugin for the first time. It will not let me sign in; I just get the perpetual spectra wave of death, regardless of browser or pop-up blocker settings. If it is down for others, that may explain it. I still have access to my server, so no big deal.
  17. 10-4. I'll give that a shot and report back. Disk6 is coincidentally right next to the slot where I replaced the parity, so there is a decent chance I disturbed the cable. It's almost done with a SMART extended test. If it passes that with no issues, then I'll have higher confidence in it. Thanks JorgeB.
  18. Update: I read about the "New Config" utility, but it makes me nervous. I want to make sure I'm going to proceed correctly. I assume I replace the failed/untrusted drive with the new pre-cleared drive and, using the New Config tool, assign it to the same position in the array, then start the array and let it rebuild? But what about parity? I don't know how valid the parity is on the original 10tb parity disk. The array has been started and online without it assigned in the parity slot, so undoubtedly it's not synced. If I've lost the data on that 4tb drive, it's not that big of a deal, as anything critical is backed up off-site. But I don't want to nuke the array by accident and lose all of my media, as that would take forever to replace. I don't want to do anything until an expert chimes in. I've used Unraid for a couple of years now, but I've never dealt with this particular situation before and want to tread lightly.
  19. Hey guys, so I was upgrading my parity drive from 10tb to 14tb, and during the parity rebuild, a 4tb data drive threw IO errors and was disabled by Unraid. It passes SMART tests, but who knows. Maybe it was the cable? But it is an older drive, so I'd rather not chance it. Where I may have messed up is: I stopped the parity rebuild as soon as this happened, in a sort of panic, added the old parity drive back, and attempted to reassign it as parity, thinking I could revert in a way, and just have a bad data disk to replace with a new one (10tb) and THEN replace the parity like originally planned. But now Unraid is not allowing me to start the array with the old 10tb drive as parity while keeping the disabled drive in the array. It also does not recognize the 14tb as a parity drive, and it shouldn't, because it was only a few percent into the parity rebuild. So currently the array is started but has no parity disk assigned, and the disabled drive is being "emulated", which I'm not sure how that's possible without parity. So, what are my next steps to recover? I've got an additional 10tb drive on the way to replace the now failing (I presume) 4tb drive. Thanks for any help in advance! Ross
  20. No worries. If you want budget, then I'd get a GTX 1050 or a Quadro P400 for Plex transcoding. I run a P400 I got off eBay for like $70 over a year ago; it works very well, and no external power is required.
  21. I don't believe a GT730 will work. It's not listed on the NVIDIA NVENC support matrix.
  22. Figured so, thanks for the reply, Squid!
  23. Was Troubleshooting Mode removed? I'm not seeing where to activate it. Of course, I may have missed something! edit: I did find a "Mirror syslog to flash" option under Syslog Server. Did this replace Troubleshooting Mode? I think this will probably do what I need. I'm currently chasing random lockups/partial crashes where I lose network access, and I'm having issues with it writing diagnostics to flash via console commands.
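For the console side of this, a couple of commands that should help when chasing crashes like these (assuming stock Unraid tooling; the exact paths may differ by version):

```shell
# Write a full diagnostics zip to the flash drive (/boot/logs) so it
# survives a hard reset
diagnostics

# Watch the live syslog while waiting for the lockup to happen
tail -f /var/log/syslog
```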
  24. Your problem may be due to the /transcode path not being mapped to something like /tmp (if you wanted to transcode to RAM) or /tmp/PlexRamScratch (if you use the size-limited RAM disk script posted earlier in the thread). Since /transcode most likely does not exist as an actual directory in your setup, I believe Plex is defaulting to transcoding within the docker image, and that is why it is filling up.
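For reference, the fix would be to give the container a real path mapping for /transcode. A docker run sketch (the host path is an example from this thread, and most people would add this mapping in the Unraid container template instead):

```shell
# Map a RAM-backed host path into the container at /transcode, then set
# Plex's transcoder temporary directory to /transcode in its settings.
# Other port/volume mappings are omitted here for brevity.
docker run -d --name plex \
  -v /tmp/PlexRamScratch:/transcode \
  plexinc/pms-docker
```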