Falcosc

  1. It was a bit tricky to remove the bottleneck because the script was already tuned to process sequentially with one thread per disk. The issue was expensive binary calls with a lot of syscalls, like file stat and get file attribute, plus unnecessary subshells for each file. After fixing these I found all the other issues, like unwanted parallel IO caused by monitoring. I also experimented with offloading the genuinely needed but still expensive syscalls into asynchronous subshells, but XFS isn't good enough to handle asynchronous file metadata actions without hurting spinning disk IO too much. If the XFS metadata were completely cached, you would reach nearly peak disk read performance on small files (I did test runs without metadata access, which achieved this). Check Export skips all of these steps because it doesn't need to read or write file attributes if the exported hash matches. For that reason it is the fastest process: it basically feeds the hash list straight into the CLI argument and only starts custom error handling if the CLI response stream contains verification errors.
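The per-file overhead described above can be sketched roughly like this (the directory and the loop are illustrative, not the plugin's actual code): spawning a subshell plus a stat binary for every file costs a fork/exec round-trip each time, while a single find -printf pass emits the same metadata from one process.

```shell
# Sketch only: per-file subshell + stat vs. one batched find pass.
# DIR is a throwaway demo directory, not a path the plugin uses.
DIR=$(mktemp -d)
touch "$DIR/a" "$DIR/b" "$DIR/c"

# Slow pattern: one fork/exec + stat round-trip per file.
for f in "$DIR"/*; do
  size=$(stat -c %s "$f")
  echo "$f $size"
done

# Fast pattern: a single GNU find process prints path and size for all files.
find "$DIR" -type f -printf '%p %s\n'
```

Both loops print the same three path/size pairs; the difference is only in how many processes and syscalls it takes to get them.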
  2. Actually you can't see how fast blake3 is, because a single thread can handle 4-6GB/s, which is way beyond spinning disk performance. So we configured it to run single threaded to be more efficient. The main goal of introducing blake3 was to reduce CPU load. I did benchmark memory and thread settings to get the best small file performance. The biggest issue was the number of syscalls in the old script, but we had other issues which slowed down spinning disk IO as well.
     - added blake3 hash support for 2-6 times more hash rate than blake2
     - reduced CPU load up to 4 times at the same data rate when using blake3
     - improved build, verify and check speed for all hash methods on small files
     - fixed stacking find commands which prevented clean shutdowns while watching the control page
     - fixed starting multiple build status checks while watching the control page
     - added monitoring of background verify processes and all manual bunker script executions
     - fixed rare process status detection bug "Operation aborted"
     - fixed file name truncation if a key is broken
     - made the inline help for the disk table and commands more accessible
     - fixed multi-language support for buttons
     - added watching of error and warning messages of running or finished processes
     - added a disk status icon for the running build status check
  3. It could still be possible that we need a broken disk2.tmp.end file in case the bug is not obvious. (I worry that it isn't obvious, because at least one test run with simulated corruptions didn't have this issue.)
  4. That is the reason why I would like to have the file. I tested it once and didn't see any issue. Edit it as much as you like. In the best case, you can reproduce the issue with just normal characters and only 3-5 lines in your .end file.
  5. Yeah, you can remove your personal stuff and save it to see if the issue persists with your modified file.
  6. Please share your /tmp/*disk2* files and /var/tmp/*disk2* files. I did add errors to them, and it looks like there is at least one layout function which does not ignore this new content. Is a disk2 process running? You should not see any disk2 stuff if no disk2 process is running. After a process is finished, a reload will remove your disk2 progress info.
  7. I thought about making the export files hash specific myself, but after discovering all the places, I did not dare to change the export file names. @bonienl, did you forget some of them? I remember there were multiple places where these filenames are used: export file creation, export file checks, and export file status. And I remember we have an additional hash file definition in the exportrotate script and another one in the watcher script. At least for the export status, you find the error here: https://github.com/bergware/dynamix/blob/master/source/file-integrity/include/ProgressInfo.php#L57
  8. You need to remove your SHA256 export file. In my last post, I said that you can keep your SHA hashes in your extended file attributes, but you cannot use the "Check Export" button if you configured blake3 and have SHA hashes in your export folder. If you want to keep your SHA exports, just rename them to avoid them being picked up by the "Check Export" button.
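The rename step could look something like the sketch below; the export folder location and the *.hash suffix are my assumptions, not confirmed plugin behavior, so check what your export files are actually called before running anything like it.

```shell
# Sketch only: move SHA256 export files aside so "Check Export" with
# blake3 configured won't pick them up. The folder path and the *.hash
# pattern are assumptions - adjust to your actual export folder.
rename_sha_exports() {
  dir=$1
  for f in "$dir"/*.hash; do
    [ -e "$f" ] || continue          # empty glob: nothing to rename
    mv "$f" "$f.sha256.bak"          # keep the hashes, hide them from the check
  done
}
```

Called as `rename_sha_exports /mnt/disk1/export` (path hypothetical), it leaves the hashes on disk under a suffix the check won't match, so you can rename them back later.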
  9. Each hash method gets its own extended file attribute, so a build is enough if you want to keep your blake2 keys. But because the last scan date in the file stats is shared, and we don't have a button to just clean the last scan date, it would be better if you remove all attributes. I just wanted to make clear that you don't have to. On the other hand, the shared scan date isn't tested very well, because changing hash methods is an uncommon use case.
  10. I would suggest adding all the small improvements to the change log, in case somebody doesn't follow this thread and wants to know if a specific error was fixed: https://github.com/bergware/dynamix/pull/53/files
  11. Yes, but it should be done in 0.4s real time and 3.5s user time if it scaled better (user time is the sum of all thread times), because single threaded was done in 3s user time. So I thought maybe it is not a scaling issue but a memory speed issue, because for a 60GB/s hashing rate you may need a memory speed of 80-160GB/s, which your platform doesn't have. But at these insane speeds, many things could be limiting the scalability of the multithreaded execution. Your CPU is just too fast for blake3.
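The numbers in that estimate fit together like this (the 23GB file size is taken from the later b3sum test post, and 8 threads is an assumption about the CPU in question):

```shell
# Worked numbers behind the scaling estimate. 23GB file and 8 threads
# are assumptions taken from the surrounding discussion.
awk 'BEGIN {
  size_gb = 23; single_user_s = 3; threads = 8
  per_thread = size_gb / single_user_s    # GB/s one thread achieves
  ideal_real = single_user_s / threads    # real time with perfect scaling
  ideal_rate = size_gb / ideal_real       # aggregate rate at perfect scaling
  printf "per-thread %.1f GB/s, ideal real %.2f s, ideal rate %.0f GB/s\n",
         per_thread, ideal_real, ideal_rate
}'
```

Perfect scaling of the 3s single-threaded run across 8 threads would land near 0.4s real time and roughly 60GB/s aggregate, which is exactly where memory bandwidth becomes the suspect.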
  12. I don't know how well hash calculations should scale, but spending more than twice the CPU time on multithreaded blake3 (8s multi vs 3s single) sounds a bit too wasteful. I mean, it is good that they at least give this possibility in case somebody needs the 23GB/s throughput, even if it is less efficient. But this just confirmed my feeling that we should keep it single threaded to improve efficiency. Maybe memory access time counts as user time? Because perfect scaling should result in 60GB/s, which would already be the limit of your memory performance. We need more memory channels and DDR5 🤓 On huge files it would be most efficient to run single threaded but with mmap, because on your 23GB no-mmap run you spend an additional 900ms of sys time on all the little file access calls. But I don't want b3sum to pop up in the memory usage graph as a big resource hog, to avoid wrong perceptions of this wonderful tool, so I would sacrifice the sys time saving for having an invisible memory footprint with no-mmap.
  13. This ridiculous speed is the reason why the addition of blake3 is more about reducing CPU stress. But I managed to get a 2-3 times speed improvement by reducing the number of expensive CLI calls per file in the build, check and verification processes. I avoided writing numbers into the change log because the script speed improvement is only related to file count. With an average of 2.8MB/file (180,000 files at 500GB) I got from 65MB/s to 170MB/s on any hash method.
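The average file size quoted there is just the totals divided out (using 1GB = 1024MB):

```shell
# 180,000 files in 500GB -> the ~2.8MB/file average quoted in the post.
awk 'BEGIN { printf "%.1f MB/file\n", 500 * 1024 / 180000 }'
```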
  14. Remember, with AVX-512 you can calculate eight 64-bit integers in one go. So basically, 8 times the speed on a single thread. That's just cheating if correctly implemented.
  15. It depends on your CPU. Without AVX it uses SSE4.1 and is only 2 times faster than blake2. But with AVX it is just crazy: 6GB/s on 1 thread. Because having more than 10-60GB/s speed per disk is really uncommon even in the far future of Unraid, I forced b3sum to run in single thread mode and I forced it to not use mmap. mmap is an expensive syscall to map your file into userspace memory; users already complained that mmap reduces performance on small files too much, because this syscall takes much longer than hashing the whole file. For that reason, the recent version of the b3sum CLI skips mmap on files smaller than 16KB. Multithreading has overhead, too, so disabling multithreading gives even more small file performance. And if you use single threading, you don't benefit from mmap; you don't need to map your file into memory because you read it sequentially anyway. If you have multiple disks in your array, you don't need multithreading at all. So for the Unraid use case, it doesn't make sense to even make the parameters of b3sum configurable in the UI. I have a 2-thread G4400T in temporary use, which is already faster than my array on blake3 despite being nearly the worst-case CPU for Unraid. On some CPUs with the Intel SHA Extension, blake3 running single threaded is even 2 times faster than hardware accelerated SHA: https://github.com/xoofx/Blake3.NET#results-with-sha-cpu-extensions Running multithreaded is just unnecessary computing overhead; nobody has storage fast enough to benefit from 10 times more speed than hardware accelerated SHA: https://news.ycombinator.com/item?id=22236507 But you can just test it yourself:
      wget https://github.com/BLAKE3-team/BLAKE3/releases/download/1.0.0/b3sum_linux_x64_bin
      chmod +x b3sum_linux_x64_bin
      time ./b3sum_linux_x64_bin largefile
      time ./b3sum_linux_x64_bin --num-threads=1 --no-mmap largefile