Pauven

Everything posted by Pauven

  1. Depending upon your motherboard's design, and which slot you have the card installed, the card may be communicating with the CPU through the southbridge chipset (PCH), which might be the bottleneck. Often the PCH has a smaller pipe to the CPU, which is shared with all southbridge devices, commonly including SATA ports. If the southbridge connection is the limiting factor, then even moving a drive from the H310 to a motherboard SATA port might not make any difference if the motherboard SATA port is also going through the southbridge. I followed the link you provided to your current Unraid server (hopefully it still is current), and downloaded the manual for your mainboard. I see that there is one x16 slot (electrically x8), and two x8 slots (one x8 and the other electrically x4): It looks like the two electrically x8 slots both connect directly to the CPU, so as long as you are using either of those, I think you would be okay. The electrically x4 slot, furthest from the CPU, connects to the PCH - you should not be using this one. Looking at the system block diagram, I see that all 6 SATA ports are connected through the PCH. If the PCH is the bottleneck, and if you have the H310 correctly installed in one of the two x8 PCIe slots connected to the CPU, then moving a drive from the H310 to the motherboard may actually further slow down speeds. In that case, you may want to try the opposite, and move an array drive from the motherboard to the H310, so that you have 8 drives connected directly to the CPU, and only 4 drives connected through the PCH. Lastly, the PCH connects to the CPU via a DMI v2.0 x4 link, which is good for 2GB/s. That should be more than sufficient for 4 array drives (I'm not counting your cache), but if you have the H310 installed in the PCH connected PCIe slot, then you have 11 drives going over this link. 
11 drives * 130 MB/s * 1.36 overhead = 1945 MB/s, which is suspiciously close to the 2000 MB/s limit of the DMI connection between the PCH and the CPU.
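The estimate above can be reproduced with a short sketch. It assumes the rule-of-thumb 1.36 overhead multiplier and the 2000 MB/s DMI figure quoted in the post; the function and constant names are hypothetical, and nothing here is measured data.

```python
# Rough estimate of the bandwidth a group of drives puts on a shared link.
# OVERHEAD_FACTOR is the rule-of-thumb 1.36 multiplier used in the post;
# the function name is hypothetical, chosen only for illustration.
OVERHEAD_FACTOR = 1.36
DMI2_X4_LIMIT_MBPS = 2000  # DMI v2.0 x4 limit quoted in the post (2 GB/s)


def estimated_link_load(drive_count, per_drive_mbps, overhead=OVERHEAD_FACTOR):
    """Return the approximate link load in MB/s, including protocol overhead."""
    return drive_count * per_drive_mbps * overhead


load = estimated_link_load(11, 130)
utilization = load / DMI2_X4_LIMIT_MBPS
print(f"{load:.0f} MB/s, {utilization:.0%} of the DMI limit")
```

A load this close to the link's ceiling is what makes the PCH-connected slot the prime suspect; a much lower percentage would point elsewhere.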
  2. The speeds seem artificially low. My 3TB 5400 RPM constrained array can hit 140 MB/s, and your 4TB drives should be marginally faster. While 130 MB/s is close, I think you have a bottleneck somewhere. With 7 drives on your SAS 2008 controller, let's check and see if that could be the culprit. 7 * 130 * 1.36 (this is an easier version of the formula I detailed above) = 1237 MB/s going through your controller. PCIe 1.0 x8 and PCIe 2.0 x4 both support 2000 MB/s, and PCIe 1.0 x4 supports 1000 MB/s. None of that lines up with 1237 MB/s, so it doesn't seem like this is a PCIe bus related constraint. That doesn't rule out the SAS 2008 controller, though - maybe it is just slow... Perhaps you have something about your build that doesn't show up in the report. Expanders? Maybe when using all of your SATA ports on your motherboard (sdb, sdc, sdd, sde) you are hitting some kind of bus limit? 4 * 130 * 1.36 = 707 MB/s, which again doesn't really seem like a common bus limit. I think you should try @jbartlett's DiskSpeed testing tool. Other thoughts: You have one of those servers that doesn't seem to react to changing the Unraid disk tunables. Except in extreme edge cases, you get basically the same speed no matter what. On the repeated tests, most seem to be within +/- 0.9 MB/s, which is a fairly large variation, and for that reason your fastest measured speed of 129.7 is essentially the same as anything else hitting 127+ MB/s. Also, on at least one repeated test (Pass 1_Low Test 2 @ Thresh 120 = 127.8, and Pass 2 Test 1 = 116.6), the speed variation was 11.2 MB/s, which is huge. Perhaps you had some process/PC accessing the array during one of those, bringing down the score. For that reason, I say pretty much every test result was an identical result, and you probably won't notice much of any difference between any values. There's certainly no harm in using the Fastest values, as the memory utilization is so low there's no reason for you to chase more efficiency.
Keep in mind if you use jbartlett's DiskSpeed test and find the bottleneck, and you make changes to fix it, you would want to rerun UTT to see if the Fastest settings change.
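The "does this line up with a bus limit?" check from the post above can be sketched as follows. The two bus limits are the figures quoted in the post; the 10% window used to decide what counts as "close" and all of the names are assumptions made for this illustration.

```python
# Compare an estimated controller load against a few common bus limits.
# The limits are the figures quoted in the post; the 10% "suspiciously close"
# window and all names are assumptions made for this illustration.
BUS_LIMITS_MBPS = {
    "PCIe 1.0 x4": 1000,
    "PCIe 1.0 x8 / PCIe 2.0 x4": 2000,
}


def matching_limits(load_mbps, window=0.10):
    """Return bus names whose limit sits at or just above the load."""
    matches = []
    for name, limit in BUS_LIMITS_MBPS.items():
        # A saturated bus shows a load slightly below its limit, never above it.
        if limit * (1 - window) <= load_mbps <= limit:
            matches.append(name)
    return matches


controller_load = 7 * 130 * 1.36   # seven drives on the SAS 2008 controller
onboard_load = 4 * 130 * 1.36      # four drives on the motherboard SATA ports

print(int(controller_load), matching_limits(controller_load))
print(int(onboard_load), matching_limits(onboard_load))
```

Neither load falls just under one of the listed limits, which is why the post concludes this does not look like a PCIe bus constraint.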
  3. I just checked in NerdPack too, and looks like I have an update available: The version/name is a bit odd. On the slackware repository, the version is screen-4.6.2-x86_64-2.txz, but this one in NerdPack has an 's', 4.6.2s. Not sure if that means anything special...
  4. root@Tower:/boot/utt# screen -version Screen version 4.06.01 (GNU) 10-Jul-17 I found this on my server in the NerdPack packages folder, so definitely the 64-bit version. Since it is already downloaded by NerdPack, you could just install it from there. \\<servername>\flash\config\plugins\NerdPack\packages\6.6\screen-4.6.1s-x86_64-1.txz
  5. Any chance that the screen problem is because we are installing a 32-bit version on a 64-bit server? Anyone know? Here's a URL to the 64-bit version: https://mirrors.slackware.com/slackware/slackware64-current/slackware64/ap/screen-4.6.2-x86_64-2.txz
  6. What was the output from step 3) Install screen: upgradepkg --install-new screen-4.6.2-i586-2.txz ?
  7. Hmmm. Well, I guess the good news is that you are only getting a couple notifications instead of hundreds, so it seems like it is mostly working. I'm not sure how a couple are slipping through. The one at the end is actually not that surprising, as there can be a delay for Unraid to send out each parity check start/finished notification, and the UTT script might have already removed the block at the end of the script before that notification comes through, so it should almost be expected that the very last parity check finished notification slips through. But the one at the beginning has me stumped, since the block is put into place before any parity checks are started. I see you are on Unraid 6.7.x - perhaps something has changed related to notifications since Unraid 6.6.x. I did all my development on Unraid 6.6.6, and I refuse to use 6.7.x until the numerous SMB and SQLite issues have been resolved.
  8. That sounds artificially low. Agreed. Looking at your test results, I see a couple things. First, you have a mixture of drives: 8TB, 6TB and 4TB. This has an impact on max speeds. How? Imagine a foot race with world's fastest man, Olympic champion Usain Bolt, your local high school's 40m track champion, a 5-year-old boy, and a surprisingly agile 92-year-old grandmother. I know you're thinking Usain will win, but wait... All four runners are on the same team, and they are roped together, and the race requirement is that no one gets yanked down to the ground - everyone has to finish standing up. Now it seems a bit more obvious that no matter how fast Usain is, he and his teammates basically have to walk alongside the 92-year-old grandmother who is setting the pace for the race. This is how Parity Checks work on Unraid. In my server, my 3TB 5400 RPM drives are the slowest, so they set the pace at 140 MB/s, even though my 8TB 7200 RPM drives can easily exceed 200 MB/s on their own. I'm not sure which drives are slowest in your system, your 4TB drives look like 7200 RPM units, so it might be the 6TB drives. But even though your drive mixture is slowing you down some, even your slowest drive should be good for 150+ MB/s. So something else is slowing your server down. To determine what that bottleneck is, math is your friend. I see that you have 16 drives connected to your SAS2116 PCI-Express Fusion-MPT SAS-2 controller. To understand what kind of bandwidth that controller is seeing, simply multiply the max speed by the number of drives: 16 Drives * 89.2 MB/s = 1,427 MB/s But that is just the drive data throughput. SATA drives use an 8b/10b encoding which has a 20% overhead throughput penalty, so your realized bandwidth is only 80% of what the controller is seeing. So we need to add the overhead back into that number: 1427 MB/s / 0.80 = 1784 MB/s We also need to factor in the PCI-Express overhead.
While the 8b/10b protocol overhead in PCIe v1 and v2 is already factored into those speeds, there are additional overheads like TLP that further reduce the published speeds. You might only get at most 92% of published PCI-e bandwidth numbers, possibly less: 1784 MB/s / 0.92 = 1939 MB/s being handled by your PCI-Express slot. 1939 MB/s is a very interesting number, as it is very close to 2000 MB/s, which is equivalent to PCIe v1.0 x 8 lanes, and PCIe v2.0 x 4 lanes. So, long classroom lecture short, most likely what is happening is that your SAS controller is connecting to your system at PCIe 1.0 x8 or PCIe 2.0 x4. I'm not certain what controller you have, but based upon the driver I think the card has a PCIe 2.0 x8 max connection speed, which should be good for double what you are getting (perhaps around 182 MB/s for 16 drives). So you probably have plugged the controller into the wrong slot. On many motherboards, some of the x16 slots are only wired for x4, so while your PCIe 2.0 x8 card would fit in the x16 slot, the speed gets reduced to half-speed, PCIe 2.0 x4. Alternatively, you might have a really old system that only supports PCIe 1.0, which again would cut your speeds in half. Your signature doesn't specify your exact hardware, so I don't know which it would be. One last tip: If you are doing Windows VM's with passthrough graphics, and you are putting your graphics card in the fastest PCIe slot hoping for max speed - that probably isn't needed. I did some testing a couple years back, putting the video card in PCIe 3.0 x 16 and PCIe 3.0 x 4 slots, and in 3D Mark the score was nearly the same. I know all the hardware review websites like to make a big deal about PCIe bandwidth and video cards, but the reality is that for gaming it really doesn't make much of a difference. On the other hand, 16 fast hard drives can easily saturate a PCIe 2.0 x8 connection, so it is very important to put your HD controller in the fastest available slot. </class> Paul
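The arithmetic in the post above can be followed step by step with a small sketch. The 0.80 and 0.92 efficiency figures are the ones given in the post; the function names and the mixed-array drive speeds in the last example are hypothetical.

```python
# Back out the PCIe bandwidth a controller must carry during a parity check.
# The 0.80 and 0.92 efficiency figures come from the post; the drive speeds in
# the last example and all function names are hypothetical.
SATA_EFFICIENCY = 0.80   # 8b/10b encoding leaves 80% for data
PCIE_EFFICIENCY = 0.92   # rough ceiling after TLP and similar overheads


def parity_check_pace(drive_speeds_mbps):
    """All drives advance together, so the slowest drive sets the pace."""
    return min(drive_speeds_mbps)


def required_slot_bandwidth(drive_count, pace_mbps):
    """Return the raw PCIe bandwidth in MB/s implied by the observed pace."""
    data_rate = drive_count * pace_mbps
    return data_rate / SATA_EFFICIENCY / PCIE_EFFICIENCY


# The post's case: 16 drives observed at 89.2 MB/s each.
print(round(required_slot_bandwidth(16, 89.2)))

# Combined overhead multiplier for the two efficiency figures.
print(round(1 / (SATA_EFFICIENCY * PCIE_EFFICIENCY), 2))

# Hypothetical mixed array: the 140 MB/s drives cap the faster ones.
print(parity_check_pace([140, 140, 205, 210]))
```

The combined multiplier works out to roughly 1.36, which matches the shortcut factor used in the other bandwidth estimates in this listing.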
  9. UTT v4.x also sends out a test begin and a test end notification, instead of the hundreds of notifications you would get without the block. Any chance you're confusing the UTT notifications with the Unraid Parity Check notifications?
  10. Sorry, I should have tried the link that StevenD provided. I didn't realize it wasn't a direct link to the file, but rather to a web page from where you can start a download. This URL should work: http://mirrors.slackware.com/slackware/slackware-current/slackware/ap/screen-4.6.2-i586-2.txz I'll update my post above too.
  11. To expand on StevenD's answer:
      1) Change into your UTT directory: cd /boot/utt
      2) *** Download screen: wget http://mirrors.slackware.com/slackware/slackware-current/slackware/ap/screen-4.6.2-i586-2.txz
      3) Install screen: upgradepkg --install-new screen-4.6.2-i586-2.txz
      4) Run screen: screen
      *** NOTE: You should only have to download screen once, and you can do this from your Windows PC and save it to your \\<servername>\flash\utt directory or via the wget command line above. Each time you reboot, screen is no longer installed as Unraid boots from a static image, so you would still need to do steps 1, 3 & 4, but skip step 2 since you had downloaded it previously.
  12. Yeah, you want to stop any and all access of your shares during the test, from any and all sources.
  13. LOL! So true! Hadn't thought of that... You really only need to use safe mode if you have a ton of stuff that's just too hard to disable individually. Instead, just make sure you stop all VM's and Dockers, plus any plugins that would be accessing your array disks. I haven't noticed any issues from the CacheDirs plug-in, since once it is running it's mainly pinging RAM to prevent disks from spinning up, but you can always stop that one too just to be safe. Other alternatives: You can run directly on your server's console instead of remote access, completely eliminating the need for screen. Using screen is optional, though recommended when running remote. If you have confidence that your network connection is solid, that your PC won't sleep, shut down, or randomly update and reboot itself during the test, and that power brownouts/blackouts won't disrupt your connection, then screen really isn't needed. Screen is like insurance - many people get by without it. Though I am also curious - how can you run screen in safe mode?
  14. Sorry I missed this comment earlier. I'm not sure what to make of this. UTT performs tests of the Unraid Disk Tunables by running dozens or hundreds of non-correcting parity checks. But if you don't have parity disks... then how in the world are you even running UTT? I don't know if you can trust any of the results - I don't even know what the results mean anymore. If you don't have parity disks, then you shouldn't be able to check parity, and you shouldn't be able to use this tool to check parity check speeds with different tunables configured. That also might explain why negative md_sync_thresh values were responding well on your machine. Is there even a [CHECK] Parity button on your Unraid Main screen?
  15. I just posted UTT v4.1 final, in the first post. Everything you need should be in the first two posts. Perhaps @SpaceInvaderOne could do one of his great videos on using UTT...
  16. Wow. I've said it before and I'll say it again, every server is unique, some in very surprising ways. I scanned through your results, and for repeated tests I see fairly large variances of up to +/- 2.3 MB/s, so keep that in mind when comparing results. The Long test, with a 10 minute duration for each test, should provide more accurate results. Regarding consuming 0 MB, it's actually not 0. I'm rounding to the nearest MB, so anything under 0.5 MB would round down to 0 MB. Here's the formula and your actual result: (( ( md_num_stripes * (2640 + (4096 * sbNumDisks)) ) / 1048576 )) = RAM Consumed (In Megabytes) With your values: (( ( 16 * (2640 + (4096 * 7)) ) / 1048576 )) = 0.477783203 MB *NOTE: I've just added a new function to UTT v4.1 to show memory used in KB when it rounds down to 0 MB. Regarding the negative md_sync_thresh values, I had to double-check the code to see if UTT was really setting negative values, and it is. While UTT is setting negative md_sync_thresh values, I'm not sure if Unraid is overriding the values when they are below a certain threshold. While I know how to read the currently 'configured' value, I don't know how to query the currently 'set' value. Does anyone know how to do this? I did go into the Unraid Disk settings, and manually set a negative value and applied it, and Unraid saved it! So best I can tell, the UTT script is setting negative md_sync_thresh values, Unraid is accepting them, and your server is responding better with them. Perhaps @limetech can share some insight. Paul
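The memory formula quoted above can be checked with a short sketch. The constants 2640 and 4096 are copied from the post's formula; the function names are illustrative, and the KB fallback is only a guess at the behaviour the post describes, not UTT's actual code.

```python
# Memory consumed by the md stripe buffers, per the formula quoted in the post.
# The constants 2640 and 4096 are copied from that formula; the function and
# parameter names are only illustrative.
BYTES_PER_MB = 1048576


def stripe_ram_mb(md_num_stripes, num_disks):
    """Return RAM consumed in megabytes for the given stripe and disk counts."""
    return md_num_stripes * (2640 + 4096 * num_disks) / BYTES_PER_MB


def format_ram(md_num_stripes, num_disks):
    """Show MB normally, but fall back to KB when the value rounds to 0 MB."""
    mb = stripe_ram_mb(md_num_stripes, num_disks)
    if round(mb) == 0:
        return f"{mb * 1024:.0f} KB"
    return f"{round(mb)} MB"


print(stripe_ram_mb(16, 7))
print(format_ram(16, 7))
```

With 16 stripes and 7 disks the result is just under half a megabyte, which is why a report rounded to whole megabytes shows 0 MB.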
  17. Fantastic! That settles it then, I'll release UTT v4.1 final today. Those results look perfect to me. Proof that, even as good as Unraid v6.x performs with stock settings on most servers, some servers still need tuning. Going from 141 MB/s stock to 164 MB/s tuned nets you a nice 16% bump in peak performance. I also find the Thriftiest settings very interesting. Only 22 MB of RAM consumed (16 MB less than stock Unraid), yet a solid 15 MB/s (11%) gain over stock performance. The consistency of your results for the repeated tests is +/- 0.1 MB/s, so you can trust the report accuracy on this server. I really appreciate you doing the Extra Long test. As I expected, the extra nr_requests tests only provided slower speeds once the other settings were tuned. I'm still curious if there will be a server out there that responds well to lower nr_requests values once tuned, but it seems less and less likely. Personally, I'd probably go with the Fastest values on your server. The Recommended values only save you 122 MB over the Fastest, and the Fastest are only consuming 366 MB. If you had a lot more drives, the memory consumption would go up proportionally and the lower Recommended values to save RAM would make more sense then.
  18. Thanks for confirming. I think UTT v4.1 BETA 3 is ready to make the jump to final.
  19. I was starting to feel a bit guilty for still rock'n the beastly 6.6.6, especially while trying to trouble-shoot all these storage report issues for users running 6.7.x. Now I feel a bit vindicated for sticking with Damienraid, and happy I avoided all that SMB/SQLite nonsense. Hopefully my server hasn't sold its circuits to Beelzebub and won't be stuck on 6.6.6 forever in a journey to the bottomless pit... Perhaps I need to rename my server from Tower. Abaddon... Apollyon... Beelzebub... Belial... Dragon... I know, Leviathan!
  20. Small correction on what I wrote here. The mdcmd status output only has drives 0-29, which is predefined by Unraid to Parity and Data disks only. 54 is the flash drive, and 30 & 31 are cache drives (I'm sure there's other predefined assignments, but that is all I've mapped out). So I was getting myself confused as to how I was getting the flash and cache drives to show in the report, since they are not in the mdcmd status output. I finally realized that I am using both mdcmd status and the /var/local/emhttp/disks.ini file to build the DiskName2Num lookup. Looks like /var/local/emhttp/disks.ini has all array drives, up to 54, so it includes the flash and cache. (yes, that means I have an unnecessary, redundant operation using the mdcmd output to build the DiskName2Num lookup, but it doesn't hurt anything) Ultimately the story stays the same - non-array drives aren't in /var/local/emhttp/disks.ini either, so they still don't get in the report.
  21. Thanks @jbartlett! Any chance your two NVMe drives are non-array devices?
  22. My mistake, looks like you were right. I build a DiskName2Num lookup array, but it is based upon the data from mdcmd status, which of course only provides data on array devices. That means these unassigned disks don't get a Disk Name to Disk Number lookup entry, so it's not available for the final report. I'm a little conflicted on this. On the one hand, I wanted the report to be a complete picture of all controllers and attached drives, but on the other hand I guess having it only display array devices is nice too, since these are the only drives being tested and tuned. I don't think I would be able to include non-array drives without a significant rewrite of this report. So.... no. Not gonna happen.
  23. Right. The Short test omits Passes 2 & 3, to make it quicker, and never makes any recommendations - primarily because the 10 second tests are way too quick to be accurate and you get a lot of fake numbers. For some users, their server responds the same no matter what tunables are used. That's the point of the Short test, to save them 8+ hours of running the longer tests if it won't help them.
  24. Why? I'm still on 6.6.6 because I've seen too many issues reported in the 6.6.7 and 6.7.x branches that just never seemed to get any solutions. Looks like those two drives still aren't showing. I'll test again on my side. I got it working with Xaero's data, and assumed it would fix yours too.
  25. Looks good. I'd say your accuracy is +/- 0.2 MB/s. So even though the new fastest recommendation is ever so slightly faster, it's within the margin of error so you likely won't see a difference.
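One way to apply the margin-of-error reasoning above is sketched here. The +/- 0.2 MB/s accuracy is the figure from the post; the sample speeds, the function name, and the choice to require a gap larger than twice the accuracy are assumptions for illustration.

```python
# Decide whether two parity check speeds differ by more than the test noise.
# The +/- 0.2 MB/s accuracy is the figure from the post; the sample speeds and
# the function name are hypothetical.
def meaningfully_faster(new_mbps, old_mbps, accuracy_mbps=0.2):
    """True only if the gain exceeds the combined error of both measurements."""
    # Each reading may be off by accuracy_mbps, so the gap must beat twice that.
    return (new_mbps - old_mbps) > 2 * accuracy_mbps


print(meaningfully_faster(164.3, 164.1))  # gap of about 0.2 MB/s
print(meaningfully_faster(165.0, 164.1))  # gap of about 0.9 MB/s
```

A stricter or looser threshold is equally defensible; the point is only that a gain smaller than the run-to-run variation should not drive the choice of tunables.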