20_100 Posted November 5, 2022 (edited)

Hi, I have a growing number of files, around 500,000, in a single share, spread across 7 disks (XFS). The files are between 30 MB and 120 MB, and they are all in the same directory. I have severe performance issues. I would like to discuss both this specific local issue and the general best practices for improving the situation.

From a local Unraid terminal, the same ls command, which returns a small subset of the files, is instant on any individual drive:

ls /mnt/disk{disknumber}/Myshare/Thefolder/*AAA*.*

but takes forever (almost 1 minute) when I execute it on the merged file system:

ls /mnt/user0/Myshare/Thefolder/*AAA*.*

The issue is the same when it comes to inserting a new file. What I want to explore in this thread:

- Is this normal/to be expected?
- Are there pieces of documentation that cover this topic? Are there figures documented in Unraid?
- Does it depend on the number of drives?
- Does it depend on how files are spread across the disks? If so, how does each share allocation method influence this?
- The file names are random. Would using sub-directories help? If so, how would the directory split level influence this?
- As CPU, RAM, and I/O usage don't seem to react, what tools would you use to investigate the bottleneck?

Edited November 5, 2022 by 20_100
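A way to put hard numbers on the difference is to wrap each command in `time`. The sketch below reproduces the pattern against a throwaway directory in /tmp (hypothetical paths; on the real server the same measurement would be run against the /mnt/diskN and /mnt/user0 paths above):

```shell
# Create a throwaway directory with a few thousand matching files,
# then time how long the glob expansion plus listing takes.
dir=/tmp/glob_bench
mkdir -p "$dir"
for i in $(seq 1 2000); do : > "$dir/file_${i}_AAA.dat"; done

time ls "$dir"/*AAA*.dat > /dev/null
```

Running `time` on both the /mnt/diskN and /mnt/user0 versions of the same glob separates the cost of the listing itself from the cost of the merged-filesystem layer.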
JorgeB Posted November 6, 2022

User shares always have some extra overhead. Creating subfolders will help. Another thing that should help, a little or a lot depending on the hardware used, is to disable the security mitigations.
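One common way to implement the subfolder suggestion, since the file names are random, is to bucket files into subdirectories keyed on a short name prefix. This is only a sketch with hypothetical /tmp paths; on the server it would operate under the real share path instead:

```shell
# Hypothetical source/destination; the real share lives under /mnt/user/Myshare.
src=/tmp/flat_share
dst=/tmp/bucketed_share
mkdir -p "$src"
for i in $(seq 1 256); do : > "$src/$(printf '%04x' "$i")_sample.dat"; done

# Move each file into a subdirectory named after its first two characters,
# turning one huge directory into many small ones.
for f in "$src"/*; do
  b=$(basename "$f")
  bucket=$(printf '%s' "$b" | cut -c1-2)
  mkdir -p "$dst/$bucket"
  mv "$f" "$dst/$bucket/"
done
```

With random names a two-character prefix spreads files roughly evenly, so each directory stays small enough to list quickly.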
Frank1940 Posted November 6, 2022

One thing to try is to spin up all the disks in the array first.

Question: how many items are being listed when you experience the "almost 1 minute" delay? My observation is that the output of the ls command is also sorted alphabetically. You should realize that this is going to take longer and longer as the number of items increases. (How much longer depends on which sort routine the Linux/Unix developer who first wrote this command implemented, as some sort routines are much faster than others, particularly as the number of items increases!)
20_100 Posted November 6, 2022 (in reply to Frank1940)

100ish results. I can't imagine it takes time to sort that 🙂
20_100 Posted November 6, 2022 (in reply to JorgeB)

I will try creating subfolders and report back.
Frank1940 Posted November 6, 2022 (in reply to 20_100)

I looked at one directory that has 430 items in it. It took ls about 2 seconds to begin the display. As a disclosure, this directory is cached using Dynamix Cache Directories, so no direct disk access is required to grab the file data needed to generate the list. I went looking for another directory with hundreds of items in it but could not find one. (I learned a long, long time ago that if you want to quickly find a single file, you should have some organizational structure to how you store things...)

About the time needed to sort items: here is a link to a Wikipedia article with a comparison of the various sort algorithms. You will notice that there are tradeoffs between time required, memory required, and code size. Hardware can also make a difference: a faster CPU means less time, for example.

https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms

I have no idea which sort algorithm is used by ls.
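As a sanity check on the sorting theory, a standalone sort of far more names than the listing involves can be timed on its own. A minimal sketch (the exact figure depends on hardware, but sorting alone is typically sub-second):

```shell
# Generate 100,000 names in random order, then time sorting them by themselves,
# with no filesystem metadata access involved.
names=/tmp/names.txt
seq 1 100000 | shuf > "$names"

time sort "$names" > /dev/null
```

If sorting 100,000 lines completes in a fraction of a second, sorting ~100 results cannot plausibly account for a one-minute delay.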
theruck Posted November 6, 2022

Try listing the directory with a different command and see if it takes as long, as the delay might be caused by ls itself. Try find, or use ls without the sorting (ls -U should be quicker). Colouring of files in bash can also have an impact on the output. It can also be filesystem dependent, so you might get different results with XFS.
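The three variants can be compared side by side. A sketch against a throwaway directory (on the real server the target would be the slow share path instead):

```shell
# Populate a test directory, then time the three listing approaches.
d=/tmp/ls_bench
mkdir -p "$d"
for i in $(seq 1 2000); do : > "$d/f$i"; done

time ls "$d" > /dev/null                        # default: sorted output
time ls -U "$d" > /dev/null                     # -U: unsorted, directory order
time find "$d" -maxdepth 1 -type f > /dev/null  # no sorting, no per-file color stat
```

If all three are equally slow on the user share, the bottleneck is below the listing command, in the filesystem layer itself.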
20_100 Posted November 6, 2022 (in reply to theruck)

The same very-high-latency behavior happens when FTP services try to list the directory, and when I try listing it over SMB, sshfs, or NFS, with the ls command, or with the Get-ChildItem command of a PowerShell running in Docker. I haven't found a way to list this directory without the issue. It also happens when I just insert a new file using any of the above methods.
20_100 Posted November 6, 2022 (in reply to theruck)

Listing 100,000 files from a single drive works instantly on the same server, as long as I run the command on /mnt/disk_ instead of /mnt/user.
20_100 Posted November 6, 2022 (edited, in reply to theruck)

Just tried with -U: exactly the same behavior.
theruck Posted November 7, 2022

Then you need to try a different filesystem, or use disk shares instead.