hoff Posted February 16, 2021 (edited)

Hi everyone.

Just looking for a quick reality check on anything I have missed, or anything I can tune to make this rocket a bit more, as I have spent close to 5 days just trying to copy data in without hangs/timeouts.

Some of my source shares have large files, which are okay on a single thread: they spool up and max the link for 10-120 seconds. Extremely large files that take longer than 4-5 minutes seem to choke the box's writes and the whole platform dies.

Small files such as my photo galleries are fine on a single thread, but with millions of small files I am looking at 2 weeks to copy, because if I go to multiple threads on RCLONE they hang in the same way.

Once the data is in there, it's going to be consumed mainly via Docker services and the off desktop mount point, so I don't see long-term reads/writes having issues, but this initial seed load is driving me insane.

Can I tweak any tunables somewhere I have not found, around RAM/schedulers, to help?

Just finished building out a new Dell R430/Dell SC200 JBOD tray system.

2 x 6-core E5 CPU
96G RAM

The system has 4 x SSD Cache Pool in the R430.
2 x 12TB Parity Drives (disabled at this time)
5 x 4TB 7200rpm
2 x 8TB 7200rpm

Details of workload

I am working to copy data from my old Synology units, which I have mounted at /tmp/source using NFS. I am copying data into /mnt/user/destination.
The share is set up with the maximum distribution, which I can see if I browse the share in the UI.
Cache tier is disabled on this share due to data sizes.

Settings

Docker - Disabled
VM Manager - Disabled
Turbo Write - Enabled
Parity Drives - Removed
Version: 6.9.0-rc2

Looking at the above, I should have the most efficient method available to me for copying data in.

RSYNC will basically slow down from 100MB/s to 0.0001MB/s after about 30 minutes of work, basically dead. (Single thread, no compression, local-to-local copy)

RCLONE seems to work better provided I limit it to 1 transfer and BW-limit it to about 70MB/s. If I use multiple threads on RCLONE it will basically melt and hang after about 20-25 minutes.

Restarting any of the copies above resolves the issues.

IOwait remains in single digits with a single copy thread and a BW limit of 80MB/s.
2 transfers and I see 3-10 iowait numbers on half the CPUs (rclone, shfs and unraidd* top threads)
4 transfers and I see 5-30 iowait on half the CPUs (rclone, shfs and unraidd* top threads)

Initial thoughts were:

Parity calc, so I removed the parity drives from the array.
Network speeds. Completed an iperf and sat at 900/900 on a 1Gbps port for almost 8 hours last night.
Turbo write mode, which is mentioned a lot, so I enabled this.
Disk issues. I have checked the SMART data on everything. The 12TB and 1 of the 8TB are brand new. The 8TBs are 6 months old, the 4TB are 16 months old and will be replaced with 8TBs that are 6 months old as soon as the Synology sources are empty.

Edited February 16, 2021 by hoff
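For anyone wanting to log the iowait figures quoted above rather than watch top, here is a minimal Python sketch. It assumes a Linux box where /proc/stat is readable, and the function names (parse_cpu_line, iowait_percent, read_cpu_times) are made up for this example. It compares two samples of the per-CPU counters and reports the share of time spent in iowait between them.

```python
from typing import Dict, List, Tuple


def parse_cpu_line(line: str) -> Tuple[str, List[int]]:
    """Split one 'cpu...' line from /proc/stat into (name, counters)."""
    parts = line.split()
    return parts[0], [int(value) for value in parts[1:]]


def iowait_percent(before: List[int], after: List[int]) -> float:
    """Percentage of elapsed CPU time spent in iowait between two samples.

    Counter order in /proc/stat: user, nice, system, idle, iowait,
    irq, softirq, steal, guest, guest_nice. Guest time is already
    included in user, so only the first eight fields are summed.
    """
    deltas = [b - a for a, b in zip(after, before)]
    deltas = [-d for d in deltas]  # after minus before
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


def read_cpu_times(path: str = "/proc/stat") -> Dict[str, List[int]]:
    """Read every cpu line (aggregate 'cpu' plus 'cpu0', 'cpu1', ...)."""
    with open(path) as handle:
        return dict(parse_cpu_line(line) for line in handle if line.startswith("cpu"))
```

To use it, call read_cpu_times() twice a few seconds apart while a copy is running and pass the two counter lists for the same CPU name to iowait_percent. This only measures the symptom; it does not say which device is causing the wait.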
hoff (Author) Posted February 16, 2021 (edited)

To be clear guys: I appreciate that getting 100MB/s is not happening due to limitations of Unraid etc.

The bit that's driving me insane is the process hangs/IO timeouts causing the apps to fail, and losing 8-10 hours of copy time while I'm asleep.

I am going to leave RCLONE on a single transfer at 20MB/s tonight and see what happens. That's 3 days of copy time, but better than nothing...

Tried all the RX/TX and flow control settings tonight too, on both the source and destination.

It seems to all be down to threads: the second I do more than 1 parallel copy it explodes and dies.
1 copy, 1 thread, with a BW limit of 80MB/s and it seems stable... but if ANYTHING else happens on the box, including my Docker container backups, it basically explodes.

Yes, it's a single SAS path as the disks are not DP etc. I am going to investigate this part of it tomorrow.

Edited February 16, 2021 by hoff
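For reference, the throttled single-transfer run described here could be assembled like this. It is only a sketch: the helper name rclone_copy_args is invented for the example, the paths are the ones given in the first post, and the exact options hoff used are not shown in the thread. `--transfers` and `--bwlimit` are standard rclone flags, and rclone reads the `M` suffix as MiB/s.

```python
import shlex
from typing import List


def rclone_copy_args(source: str, dest: str,
                     transfers: int = 1, bwlimit_mb_s: int = 20) -> List[str]:
    """Build the argument list for a throttled 'rclone copy'.

    transfers    -> number of files copied in parallel (--transfers)
    bwlimit_mb_s -> bandwidth cap passed to --bwlimit with an M suffix
    """
    return [
        "rclone", "copy", source, dest,
        "--transfers", str(transfers),
        "--bwlimit", f"{bwlimit_mb_s}M",
    ]


# Paths taken from the first post; the limits match the overnight run above.
args = rclone_copy_args("/tmp/source", "/mnt/user/destination")
print(shlex.join(args))
```

The list can be handed to subprocess.run(args) as-is. Building it as a list rather than a string avoids shell quoting problems if a share name contains spaces.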
hoff (Author) Posted February 16, 2021 (edited)

Okay. This is all down to my SAS connection, I think, in case someone else finds/reads this...

[ 28.369041] mpt2sas_cm0: LSISAS2308: FWVersion(17.00.01.00), ChipRevision(0x05), BiosVersion(07.24.01.00)

All my disks are desktop disks, so SATA, so only single channel, and I only have a single SAS cable connected.

Single link on Dell H310 (1200MB/s*) <=- not my controller but close enough to be honest

Disks x per-disk throughput:
8 x 137.5MB/s
12 x 92.5MB/s
16 x 70MB/s
20 x 55MB/s
24 x 47.5MB/s

At the moment my array is made up of 8 disks in the tray, and from a little more testing based on the data above, I seem to be able to have as many threads as I want as long as I don't go over ~70MB/s... This puts me basically in the ballpark of the above.

EDIT: after 15 mins it died again... wonder how low I need to set it with threads to survive...

So this is what the array can give me; no more sooking about it.

Edited February 16, 2021 by hoff
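The per-disk figures in that list are what you get when one shared link is split across every disk at once. A quick Python check of the arithmetic, using only the numbers quoted in the post (the function names are made up for this sketch, and an even split across disks is an idealised assumption):

```python
# (number of disks, MB/s per disk) exactly as quoted in the post
QUOTED = [(8, 137.5), (12, 92.5), (16, 70.0), (20, 55.0), (24, 47.5)]


def aggregate_mb_s(disks: int, per_disk_mb_s: float) -> float:
    """Total link throughput implied when every disk is busy at once."""
    return disks * per_disk_mb_s


def per_disk_share(link_mb_s: float, disks: int) -> float:
    """Even split of a shared link across all disks (idealised)."""
    return link_mb_s / disks


for disks, per_disk in QUOTED:
    print(f"{disks:2d} disks x {per_disk:5.1f} MB/s = "
          f"{aggregate_mb_s(disks, per_disk):6.1f} MB/s total")
```

Every row multiplies out to between 1100 and 1140 MB/s, a little under the 1200MB/s nominal figure for the link, so the list appears to assume some overhead. These are ceilings for all disks streaming together, not a prediction of what a single copy thread will see.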
theruck Posted February 20, 2021

For copying small files, the fastest is FTP. You can also resume FTP transfers easily.

This is my test results table, if you want to test something else.

My experience is similar: after a few minutes of a hurray start, it chokes to ridiculous speeds with certain protocols or config combinations. Still finding the optimal setup.

My setup is way more simple though. I have only 3 SATA drives: 2 x 4TB, one for parity and one for data, plus a 120GB SSD cache. The hardware is capable of doing better, so the software is the suspect here.