Koshy Posted January 27, 2020 (edited)

[Unraid 6.8.2] When I mount an external USB hard drive via Unassigned Devices, it creates high iowait.

Before mounting: [screenshot]
After mounting: [screenshot]

I'm mounting this drive to transfer some data from the Unraid array to the external drive using binhex-Krusader. These screenshots are from before I started any transfers.

Once I start a transfer, it copies at a good speed for a while. Then iowait goes up and the transfer speed crashes. Even after I pause the copy, iowait doesn't go down. When I unmount the drive, iowait drops to zero. The backlog and drive utilization shoot up immediately when I remount the drive.

I have attached the diagnostics.

Edit: iotop while the drive is mounted and with NO file transfer (not copying files with Krusader).

Edit 2: lsof shows unrelated files being accessed. The file that is being copied is in the yellow box; all the files in the red box are unrelated to the copy.

Edit 3: This is when no transfer is taking place. These random files are being accessed on my mounted external USB hard drive.

Edit 4: It was the Dynamix Cache Directories plugin scanning the drive that was the problem. Thank you robobub [https://forums.unraid.net/profile/100006-robobub/] for figuring it out. I excluded the external drive from Cache Dirs and the problem is now resolved.

Edit 5: I think this was only a partial fix. I'm still getting high iowait and slow transfer speeds. It is better than before: once the transfer has slowed down I now get between 2 and 35 megabytes per second, where earlier I was seeing kilobytes per second or even bytes per second. iowait now stays below 50%, where earlier I had seen it go up to 65%. I'm no longer hit with high iowait before transfers, but a few minutes after I start a transfer the issue returns. lsof isn't showing other unrelated files being accessed, though.

iotop: [screenshot]

I have also attached new diagnostics.

tower-diagnostics-20200127-1248.zip
tower-diagnostics-20200128-1247.zip

Edited January 28, 2020 by Koshy (More Info)
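For readers who want to put a number on the iowait described above without relying on the dashboard, here is a minimal sketch that derives an iowait percentage from two samples of the aggregate `cpu` line in `/proc/stat`. The function name `iowait_pct` is made up for this example, and the result is not guaranteed to match what the Unraid dashboard or iotop reports, since those tools may sample and aggregate differently.

```shell
#!/bin/sh
# Percentage of CPU time spent in iowait between two samples of the aggregate
# "cpu" line from /proc/stat. Field order on Linux:
#   cpu user nice system idle iowait irq softirq steal guest guest_nice
# iowait_pct is a hypothetical helper name, not an existing tool.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal (fields 2-9); guest time is already counted in user.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6          # iowait is the 5th counter, i.e. field 6
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
            else print "0.0"
        }'
}

# Usage on a live system: sample, wait, sample again.
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   iowait_pct "$a" "$b"
```

Running the usage lines once before mounting the drive and once after would give two comparable figures for the same interval length.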
robobub Posted January 27, 2020 (edited)

One thing you can try is adjusting the I/O scheduler and queue size on the USB drive (and perhaps even the array drives being written to). I had an issue where large writes would lock up my system with high I/O wait, perhaps because the drive controller didn't like it.

    # Enumerate the drives being written to and read from, or you could do all of them with sd*
    # note, changing scheduler will change nr_requests
    echo none > /sys/block/sd[c,d,e]/queue/scheduler
    echo 4 > /sys/block/sd[c,d,e]/queue/nr_requests

Edited January 27, 2020 by robobub
Koshy (Author) Posted January 27, 2020

2 minutes ago, robobub said:

    One thing you can try is adjusting the I/O scheduler and queue size on the USB drive (and perhaps even the array drives being written to). I had an issue where large writes would lock up my system with high I/O wait.

        # Enumerate the drives being written to and read from, or you could do all of them with sd*
        # note, changing scheduler will change nr_requests
        echo none > /sys/block/sd[c,d,e]/queue/scheduler
        echo 4 > /sys/block/sd[c,d,e]/queue/nr_requests

I'm copying data from the array to the USB drive, not the other way round. I'm not writing anything to the array.
robobub Posted January 27, 2020 (edited)

5 minutes ago, Koshy said:

    I'm copying data from the array to the USB drive, not the other way round. I'm not writing anything to the array.

Okay? Your USB drive will have a device ID, and you can change the I/O scheduler and queue size for that device. I suggested changing both the source and destination drives.

Not that it's too relevant, but the post I linked was moving data from the array to the cache drive. My suggestion applies to any data transfer. It's a simple test that has a chance of helping.

Edited January 27, 2020 by robobub
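The per-device change described in this post can be sketched for a single drive as follows. This is only an illustration under assumptions: `set_queue` is a made-up helper name, `sdX` in the usage line is a placeholder for whatever device ID the USB drive actually has, and the optional fourth argument exists only so the function can be tried against a scratch directory instead of the real `/sys/block`. Whether the kernel accepts a given scheduler name or queue depth depends on the system.

```shell
#!/bin/sh
# set_queue DEVICE SCHEDULER NR_REQUESTS [SYSFS_ROOT]
# Hypothetical helper: prints the old values, then writes the new ones for ONE device.
set_queue() {
    local dev=$1 sched=$2 depth=$3 root=${4:-/sys/block}
    local q="$root/$dev/queue"
    if [ ! -w "$q/scheduler" ] || [ ! -w "$q/nr_requests" ]; then
        echo "cannot write queue settings for $dev" >&2
        return 1
    fi
    # Record the previous values so the change can be reverted by hand.
    echo "before: scheduler=$(cat "$q/scheduler") nr_requests=$(cat "$q/nr_requests")"
    # Scheduler first: as noted earlier in the thread, changing it also changes nr_requests.
    echo "$sched" > "$q/scheduler" || return 1
    echo "$depth" > "$q/nr_requests" || return 1
    echo "after:  scheduler=$(cat "$q/scheduler") nr_requests=$(cat "$q/nr_requests")"
}

# Usage (sdX is a placeholder, run as root):
#   set_queue sdX none 4
```

Writing to one device at a time avoids redirecting a single `echo` to several files at once, and the "before" line gives the values to restore if the change does not help.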
Koshy (Author) Posted January 27, 2020

1 minute ago, robobub said:

    Okay? Your USB drive will have a device ID, and you can change the I/O scheduler and queue size for that device. I suggested changing both the source and destination drives.

Also, iowait becomes large (~20%) even before I start a transfer, just after I mount the drive. Will your fix still work? Thank you.
robobub Posted January 27, 2020 (edited)

2 minutes ago, Koshy said:

    Also, iowait becomes large (~20%) even before I start a transfer, just after I mount the drive. Will your fix still work? Thank you.

I don't have your exact situation, so I'm not guaranteeing it will fix it. But it takes one second to try, and you can always change it back (you can just cat that same location to see what it was before changing).

It is a bit odd that it goes up without starting any copying. I would look to see if something is scanning the drive (e.g. with iotop + lsof, or a fancier tool).

Edited January 27, 2020 by robobub
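As a rough illustration of "look to see if something is scanning it", the sketch below lists processes that currently hold a file or directory open under a given mount point by reading the `/proc/<pid>/fd` symlinks. `open_under` is a hypothetical name and the path in the usage line is a placeholder. `lsof +D <dir>` covers the same ground and is usually the first thing to reach for; this version only assumes a Linux `/proc` and enough privileges (typically root) to read other processes' descriptors.

```shell
#!/bin/sh
# open_under DIR
# Print "PID COMMAND PATH" for every open file descriptor that points at DIR
# or anything below it. Hypothetical helper, Linux /proc only.
open_under() {
    local dir=${1%/} fd target pid
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish or be unreadable; skip those quietly.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"|"$dir"/*)
                pid=${fd#/proc/}
                pid=${pid%%/*}
                printf '%s %s %s\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)" "$target"
                ;;
        esac
    done
}

# Usage (placeholder path, run as root to see every process):
#   open_under /mnt/disks/YOUR_DRIVE
```

A process that walks the drive shows up here with directories as well as files, because a directory being listed is held open like any other descriptor.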
Koshy (Author) Posted January 27, 2020 (edited)

1 hour ago, robobub said:

    I don't have your exact situation, so I'm not guaranteeing it will fix it. But it takes one second to try, and you can always change it back (you can just cat that same location to see what it was before changing).

    It is a bit odd that it goes up without starting any copying. I would look to see if something is scanning the drive (e.g. with iotop + lsof, or a fancier tool).

Thank you. Changing the I/O scheduler and queue size unfortunately did not make a difference.

Update: iowait went down to 0, but I don't know why. Will investigate further.

Update 2: It went up again.

Edited January 27, 2020 by Koshy
Koshy (Author) Posted January 27, 2020 (edited)

3 hours ago, robobub said:

    I don't have your exact situation, so I'm not guaranteeing it will fix it. But it takes one second to try, and you can always change it back (you can just cat that same location to see what it was before changing).

    It is a bit odd that it goes up without starting any copying. I would look to see if something is scanning the drive (e.g. with iotop + lsof, or a fancier tool).

Any clues on what I should do?

Edit: lsof shows unrelated files being accessed. The file that is being copied is in the yellow box; all the files in the red box are unrelated to the copy.

Edited January 27, 2020 by Koshy
robobub Posted January 27, 2020 (edited)

Something is scanning your /mnt/disks/Seagate_Backup_Plus_Drive with the command find, PID 25568 (the second column in your screenshot is the PID).

Your diagnostics don't show what that process is or what spawned it. You can try to trace what was running with:

    ps axjf | egrep 25568

The first 4 columns will list other processes associated with it, so then you add those to the egrep command. If any of the columns is 0, 1, or 2, I would exclude it, as it is one of the root processes.

    ps axjf | egrep '25568|<column 1>|<column 2>|<column 3>'

The process number 25568 could change, so make sure to run lsof again to find out what command and PID is currently accessing the drive.

Also, the command I listed earlier to change the schedulers doesn't work. I forgot that you need to use tee when outputting to multiple devices. Though of course what I mentioned earlier about finding what is scanning your drive is more important.

    # Change all of them excluding the flash drive with sd[b-z], or enumerate specific ones with sd[b,c,e]
    # note, changing scheduler will change nr_requests
    echo none | tee /sys/block/sd[b-z]/queue/scheduler
    echo 4 | tee /sys/block/sd[b-z]/queue/nr_requests

Edited January 28, 2020 by robobub
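The manual trace with `ps axjf` and `egrep` can also be automated. The following is a sketch rather than a replacement for the commands above: it walks from a PID up through its parents by reading the `PPid:` field of `/proc/<pid>/status`, printing one `PID COMMAND` pair per line. `ancestry` is a name invented for this example, and 25568 in the usage line is just the PID from this thread, which will differ on any other run.

```shell
#!/bin/sh
# ancestry PID
# Print "PID COMMAND" for the given process and each of its ancestors,
# stopping after PID 1 or as soon as a process can no longer be read.
# Hypothetical helper, Linux /proc only.
ancestry() {
    local pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        printf '%s %s\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
        # The parent PID is the second field of the "PPid:" line.
        pid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
    done
}

# Usage (take the current PID from lsof first):
#   ancestry 25568
```

The script or service that launched the scanning process should appear a few lines below the PID that lsof reported.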
Koshy (Author) Posted January 28, 2020 (edited)

11 hours ago, robobub said:

    Something is scanning your /mnt/disks/Seagate_Backup_Plus_Drive with the command find, PID 25568 (the second column in your screenshot is the PID).

    Your diagnostics don't show what that process is or what spawned it. You can try to trace what was running with:

        ps axjf | egrep 25568

    The first 4 columns will list other processes associated with it, so then you add those to the egrep command. If any of the columns is 0, 1, or 2, I would exclude it, as it is one of the root processes.

        ps axjf | egrep '25568|<column 1>|<column 2>|<column 3>'

    The process number 25568 could change, so make sure to run lsof again to find out what command and PID is currently accessing the drive.

    Also, the command I listed earlier to change the schedulers doesn't work. I forgot that you need to use tee when outputting to multiple devices. Though of course what I mentioned earlier about finding what is scanning your drive is more important.

        # Change all of them excluding the flash drive with sd[b-z], or enumerate specific ones with sd[b,c,e]
        # note, changing scheduler will change nr_requests
        echo none | tee /sys/block/sd[b-z]/queue/scheduler
        echo 4 | tee /sys/block/sd[b-z]/queue/nr_requests

Here is what I got: [screenshot]

I have also attached new diagnostics. Your help is much appreciated. Thank you.

tower-diagnostics-20200128-1103.zip

Edited January 28, 2020 by Koshy
robobub Posted January 28, 2020

You can see what is launching that find process by going up a few lines: the Dynamix Cache Directories plugin. Add your external hard drive to the excluded folders path of that folder caching plugin.
Koshy (Author) Posted January 28, 2020

23 minutes ago, robobub said:

    You can see what is launching that find process by going up a few lines: the Dynamix Cache Directories plugin. Add your external hard drive to the excluded folders path of that folder caching plugin.

That fixed it. Thank you.
Koshy (Author) Posted January 28, 2020 (edited)

2 hours ago, robobub said:

    You can see what is launching that find process by going up a few lines: the Dynamix Cache Directories plugin. Add your external hard drive to the excluded folders path of that folder caching plugin.

I think this was only a partial fix. I'm still getting high iowait and slow transfer speeds. It is better than before: once the transfer has slowed down I now get between 2 and 35 megabytes per second, where earlier I was seeing kilobytes per second or even bytes per second. iowait now stays below 50%, where earlier I had seen it go up to 65%. I'm no longer hit with high iowait before transfers, but a couple of minutes after I start a transfer the issue returns. lsof isn't showing other unrelated files being accessed, though.

iotop: [screenshot]

2 hours ago, Koshy said:

    Also, the command I listed earlier to change the schedulers doesn't work. I forgot that you need to use tee when outputting to multiple devices. Though of course what I mentioned earlier about finding what is scanning your drive is more important.

        # Change all of them excluding the flash drive with sd[b-z], or enumerate specific ones with sd[b,c,e]
        # note, changing scheduler will change nr_requests
        echo none | tee /sys/block/sd[b-z]/queue/scheduler
        echo 4 | tee /sys/block/sd[b-z]/queue/nr_requests

I get this when I try that: [screenshot]

Edited January 28, 2020 by Koshy
RedReddington Posted January 28, 2020

    cat /sys/block/sd[b-z]/queue/scheduler
    cat /sys/block/sd[b-z]/queue/nr_requests

With these commands you can check the current values.
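One limitation of a plain `cat` over several devices is that the output does not say which value belongs to which drive. A small sketch that labels each line with the device name is shown below. `show_queue_settings` is a made-up name, it only looks at `sd*` devices, and the optional argument (an alternative sysfs root) is there purely so the function can be exercised against a scratch directory.

```shell
#!/bin/sh
# show_queue_settings [SYSFS_ROOT]
# Print one line per sd* device with its current scheduler and nr_requests.
# In the scheduler file the active scheduler is the one shown in [brackets].
# Hypothetical helper name.
show_queue_settings() {
    local root=${1:-/sys/block} dev
    for dev in "$root"/sd*; do
        # Skip anything that is not a block device directory with queue settings.
        [ -r "$dev/queue/scheduler" ] || continue
        printf '%s scheduler=%s nr_requests=%s\n' \
            "$(basename "$dev")" \
            "$(cat "$dev/queue/scheduler")" \
            "$(cat "$dev/queue/nr_requests")"
    done
}

# Usage:
#   show_queue_settings
```

Running it before and after a change makes it easy to confirm that the new values actually took effect on the intended drive.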
Koshy (Author) Posted January 28, 2020

I had another issue that might be related to this.
Nanobug Posted December 12, 2020

It looks like I have this issue. I can't see what the exact fix was, if there is one. Can anyone tell me?
Koshy (Author) Posted December 13, 2020

6 hours ago, Nanobug said:

    It looks like I have this issue. I can't see what the exact fix was, if there is one. Can anyone tell me?

I think excluding the USB drive from Dynamix Cache Directories fixed it.
Nanobug Posted December 13, 2020

5 hours ago, Koshy said:

    I think excluding the USB drive from Dynamix Cache Directories fixed it.

How do you do that?
dlandon Posted December 13, 2020

3 hours ago, Nanobug said:

    How do you do that?

It's in the Dynamix Cache Directories plugin: Settings->Folder Caching.

I think the better way to do it is to set only the folders you want to cache, rather than let it cache everything. That's the 'Included Folders' setting. As this user found out, Cache Directories will add UD disks unless you either set it to include only certain folders or exclude the UD-mounted disk specifically.
Nanobug Posted December 13, 2020

37 minutes ago, dlandon said:

    It's in the Dynamix Cache Directories plugin: Settings->Folder Caching.

    I think the better way to do it is to set only the folders you want to cache, rather than let it cache everything. That's the 'Included Folders' setting. As this user found out, Cache Directories will add UD disks unless you either set it to include only certain folders or exclude the UD-mounted disk specifically.

I've tried excluding the two folders it's being used for. I'll get back here when I know if it helped or not.
Nanobug Posted December 16, 2020

Before I made the change, CPU usage from iowait was always high. Now iowait only goes up when something is actually using that disk. So the overall performance is better, but whenever the system needs to read from or write to that disk, iowait goes up by a lot. I guess I'll have to live with this until I get a replacement disk.