January 27, 2020

[Unraid 6.8.2] When I mount an external USB hard drive via Unassigned Devices, it causes high iowait.

[Screenshot: before mounting]
[Screenshot: after mounting]

I'm mounting this drive to transfer some data from the Unraid array to the external drive using binhex-Krusader. These screenshots are from before I started any transfers.

Once I start a transfer, it copies at a good speed for a while. Then iowait goes up and the transfer speed crashes. Even after I pause the copy, iowait doesn't go down. When I unmount the drive, iowait drops to zero. The backlog and drive utilization shoot up immediately when I remount the drive.

I have attached the diagnostics.

Edit: iotop while the drive is mounted and with no file transfer running (not copying files with Krusader).

Edit 2: lsof shows unrelated files being accessed. The file that is being copied is in the yellow box; all the files in the red box are unrelated to the copy.

Edit 3: This is when no transfer is taking place. These random files are being accessed on my mounted external USB hard drive.

Edit 4: It was the Dynamix Cache Dirs plugin scanning the drive that was the problem. Thank you robobub [https://forums.unraid.net/profile/100006-robobub/] for figuring it out. I excluded the external drive from Cache Dirs and the problem is now resolved.

Edit 5: I think this was only a partial fix. I'm still getting high iowait and slow transfer speeds. It is better than before: once the transfer has slowed down I now get between 2 and 35 megabytes per second, where earlier I was seeing kilobytes per second or even bytes per second, and iowait stays below 50%, where earlier I saw it go up to 65%. I'm no longer hit with high iowait before transfers, but a few minutes after I start a transfer the issue returns. lsof isn't showing other unrelated files being accessed, though.

[iotop screenshot]

I have also attached new diagnostics: tower-diagnostics-20200127-1248.zip, tower-diagnostics-20200128-1247.zip

Edited January 28, 2020 by Koshy
January 27, 2020

One thing you can try is adjusting the I/O scheduler and queue size on the USB drive (and perhaps even the array drives being written to). I had an issue where large writes would lock up my system with high I/O wait, perhaps because the drive controller didn't like it.

    # Enumerate the drives being written to and read from, or you could do all of them with sd*
    # note, changing scheduler will change nr_requests
    echo none > /sys/block/sd[c,d,e]/queue/scheduler
    echo 4 > /sys/block/sd[c,d,e]/queue/nr_requests

Edited January 27, 2020 by robobub
January 27, 2020 (Author)

2 minutes ago, robobub said:
    One thing you can try is adjusting the I/O scheduler and queue size on the USB drive (and perhaps even the array drives being written to). I had an issue where large writes would lock up my system with high I/O wait.

        # Enumerate the drives being written to and read from, or you could do all of them with sd*
        # note, changing scheduler will change nr_requests
        echo none > /sys/block/sd[c,d,e]/queue/scheduler
        echo 4 > /sys/block/sd[c,d,e]/queue/nr_requests

I'm copying data from the array to the USB drive, not the other way round. I'm not writing anything to the array.
January 27, 2020

5 minutes ago, Koshy said:
    I'm copying data from the array to the USB drive, not the other way round. I'm not writing anything to the array.

Okay? Your USB drive will have a device ID, and you can change the I/O scheduler and queue size for that device. I suggested changing both the source and destination drives.

Not that it's too relevant, but the post I linked was moving data from the array to the cache drive. My suggestion applies to any data transfer. It's a simple test that has a chance of helping.

Edited January 27, 2020 by robobub
January 27, 2020 (Author)

1 minute ago, robobub said:
    Okay? Your USB drive will have a device ID, and you can change the I/O scheduler and queue size for that device. I suggested changing both the source and destination drives.

Also, iowait becomes large (~20%) even before I start a transfer, just after I mount the drive. Will your fix still work? Thank you.
January 27, 2020

2 minutes ago, Koshy said:
    Also, iowait becomes large (~20%) even before I start a transfer, just after I mount the drive. Will your fix still work? Thank you.

I don't have your exact situation, so I'm not guaranteeing it will fix it. But it takes 1 second to try, and you can always change it back (you can just cat that same location to see what it was before changing).

It is a bit odd that it goes up without starting any copying. I would look to see if something is scanning the drive (e.g. with iotop + lsof, or a fancier tool).

Edited January 27, 2020 by robobub
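Editor's note: the "cat it first so you can change it back" advice above can be sketched as a pair of shell functions. This is only an illustration, not something from the thread: the names `save_queue_settings`, `restore_queue_settings` and the `SYSFS_ROOT` variable are hypothetical, and the device name in the usage note is a placeholder. It relies on the fact that the kernel's `scheduler` file lists the available schedulers with the active one in square brackets.

```shell
# Sketch only: SYSFS_ROOT and both function names are assumptions made for
# illustration; they are not part of Unraid or of the posts in this thread.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/block}"

# Print one line per device: "<device> <active scheduler> <nr_requests>".
# The scheduler file looks like "[mq-deadline] kyber none"; the bracketed
# entry is the one currently in use.
save_queue_settings() {
    local dev queue active
    for dev in "$@"; do
        queue="$SYSFS_ROOT/$dev/queue"
        [ -r "$queue/scheduler" ] || continue
        active=$(sed -n 's/.*\[\(.*\)\].*/\1/p' "$queue/scheduler")
        # Fall back to the raw content if no entry is bracketed.
        [ -n "$active" ] || active=$(cat "$queue/scheduler")
        printf '%s %s %s\n' "$dev" "$active" "$(cat "$queue/nr_requests")"
    done
}

# Read lines produced by save_queue_settings on stdin and write them back.
# The scheduler is restored first because changing it resets nr_requests.
restore_queue_settings() {
    local dev sched nr
    while read -r dev sched nr; do
        echo "$sched" > "$SYSFS_ROOT/$dev/queue/scheduler"
        echo "$nr" > "$SYSFS_ROOT/$dev/queue/nr_requests"
    done
}
```

A possible use, with `sdX` standing in for the real device: run `save_queue_settings sdX > /tmp/queue.bak` before experimenting, and `restore_queue_settings < /tmp/queue.bak` to undo the change.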
January 27, 2020 (Author)

1 hour ago, robobub said:
    I don't have your exact situation, so I'm not guaranteeing it will fix it. But it takes 1 second to try, and you can always change it back (you can just cat that same location to see what it was before changing).

    It is a bit odd that it goes up without starting any copying. I would look to see if something is scanning the drive (e.g. with iotop + lsof, or a fancier tool).

Thank you. Unfortunately, changing the I/O scheduler and queue size did not make a difference.

Update: iowait went down to 0, but I don't know why. Will investigate further.

Update 2: It went up again.

Edited January 27, 2020 by Koshy
January 27, 2020 (Author)

3 hours ago, robobub said:
    I don't have your exact situation, so I'm not guaranteeing it will fix it. But it takes 1 second to try, and you can always change it back (you can just cat that same location to see what it was before changing).

    It is a bit odd that it goes up without starting any copying. I would look to see if something is scanning the drive (e.g. with iotop + lsof, or a fancier tool).

Any clues on what I should do?

Edit: lsof shows unrelated files being accessed. The file that is being copied is in the yellow box; all the files in the red box are unrelated to the copy.

Edited January 27, 2020 by Koshy
January 27, 2020

Something is scanning your /mnt/disks/Seagate_Backup_Plus_Drive with the command find, PID 25568 (the PID is the 2nd column in your screenshot).

Your diagnostics don't show what that process is or what spawned it. You can try to trace what was running with:

    ps axjf | egrep 25568

The first 4 columns list other processes associated with it, so you then add those to the egrep command. If any of the columns are 0, 1 or 2, I would exclude them, as those are root processes.

    ps axjf | egrep '25568|<column 1>|<column2>|<column 3>'

The process number 25568 could change, so make sure to run lsof again to find out which command and PID is currently accessing the drive.

Also, the command I listed earlier to change the schedulers doesn't work. I forgot that you need to use tee when outputting to multiple devices. Though of course what I mentioned earlier about finding what is scanning your drive is more important.

    # Change all of them excluding the flash drive with sd[b-z], or enumerate specific ones with sd[b,c,e]
    # note, changing scheduler will change nr_requests
    echo none | tee /sys/block/sd[b-z]/queue/scheduler
    echo 4 | tee /sys/block/sd[b-z]/queue/nr_requests

Edited January 28, 2020 by robobub
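Editor's note: the manual "grep for the PID, then grep for its parents" procedure described above can also be sketched as a loop that walks up the parent chain. This is a swapped-in technique, not what the poster ran: it reads Linux's `/proc/<pid>/status` and `/proc/<pid>/cmdline` directly instead of parsing `ps axjf`, and the function name `process_ancestry` is hypothetical.

```shell
# Sketch only: print the chain of parent processes for a PID, one per line,
# as "<pid> <ppid> <command line>". Linux-specific, because it reads /proc.
# The function name is an assumption made for illustration.
process_ancestry() {
    local pid="$1" ppid cmd
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        # The PPid: line in /proc/<pid>/status holds the parent's PID.
        ppid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
        # cmdline separates arguments with NUL bytes; make them spaces.
        cmd=$(tr '\0' ' ' < "/proc/$pid/cmdline")
        printf '%s %s %s\n' "$pid" "$ppid" "$cmd"
        pid="$ppid"
    done
}
```

Calling `process_ancestry 25568` (with whatever PID lsof currently reports) prints the find process first, then whatever launched it, and so on up the chain until it reaches a process whose parent is 0.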
January 28, 2020 (Author)

11 hours ago, robobub said:
    Something is scanning your /mnt/disks/Seagate_Backup_Plus_Drive with the command find, PID 25568 (the PID is the 2nd column in your screenshot).

    Your diagnostics don't show what that process is or what spawned it. You can try to trace what was running with:

        ps axjf | egrep 25568

    The first 4 columns list other processes associated with it, so you then add those to the egrep command. If any of the columns are 0, 1 or 2, I would exclude them, as those are root processes.

        ps axjf | egrep '25568|<column 1>|<column2>|<column 3>'

    The process number 25568 could change, so make sure to run lsof again to find out which command and PID is currently accessing the drive.

    Also, the command I listed earlier to change the schedulers doesn't work. I forgot that you need to use tee when outputting to multiple devices. Though of course what I mentioned earlier about finding what is scanning your drive is more important.

        # Change all of them excluding the flash drive with sd[b-z], or enumerate specific ones with sd[b,c,e]
        # note, changing scheduler will change nr_requests
        echo none | tee /sys/block/sd[b-z]/queue/scheduler
        echo 4 | tee /sys/block/sd[b-z]/queue/nr_requests

Here is what I got. I have also attached new diagnostics. Your help is much appreciated. Thank you.

tower-diagnostics-20200128-1103.zip

Edited January 28, 2020 by Koshy
January 28, 2020

You can see what is launching that find process by going up a few lines: it is the Dynamix Cache Dirs plugin. Add your external hard drive to the excluded folders of that folder caching plugin.
January 28, 2020 (Author)

23 minutes ago, robobub said:
    You can see what is launching that find process by going up a few lines: it is the Dynamix Cache Dirs plugin. Add your external hard drive to the excluded folders of that folder caching plugin.

That fixed it. Thank you.
January 28, 2020 (Author)

2 hours ago, robobub said:
    You can see what is launching that find process by going up a few lines: it is the Dynamix Cache Dirs plugin. Add your external hard drive to the excluded folders of that folder caching plugin.

I think this was only a partial fix. I'm still getting high iowait and slow transfer speeds. It is better than before: once the transfer has slowed down I now get between 2 and 35 megabytes per second, where earlier I was seeing kilobytes per second or even bytes per second, and iowait stays below 50%, where earlier I saw it go up to 65%. I'm no longer hit with high iowait before transfers, but a couple of minutes after I start a transfer the issue returns. lsof isn't showing other unrelated files being accessed, though.

[iotop screenshot]

2 hours ago, Koshy said:
    Also, the command I listed earlier to change the schedulers doesn't work. I forgot that you need to use tee when outputting to multiple devices. Though of course what I mentioned earlier about finding what is scanning your drive is more important.

        # Change all of them excluding the flash drive with sd[b-z], or enumerate specific ones with sd[b,c,e]
        # note, changing scheduler will change nr_requests
        echo none | tee /sys/block/sd[b-z]/queue/scheduler
        echo 4 | tee /sys/block/sd[b-z]/queue/nr_requests

I get this when I try that.

Edited January 28, 2020 by Koshy
January 28, 2020

    cat /sys/block/sd[b-z]/queue/scheduler
    cat /sys/block/sd[b-z]/queue/nr_requests

With those commands you can check the values.
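Editor's note: when the glob matches several drives, plain cat prints the values without saying which device each one belongs to. A small sketch that labels every value with its path is shown below. The function name `show_queue_settings` and its optional root-directory argument are hypothetical and exist only so the sketch can be pointed at a test directory instead of /sys/block.

```shell
# Sketch only: print "<path>: <value>" for the scheduler and nr_requests
# files of every sd[b-z] device. The function name and the optional root
# argument are assumptions made for illustration.
show_queue_settings() {
    local root="${1:-/sys/block}" f
    for f in "$root"/sd[b-z]/queue/scheduler "$root"/sd[b-z]/queue/nr_requests; do
        # An unmatched glob stays literal and is skipped by the -r test.
        [ -r "$f" ] && printf '%s: %s\n' "$f" "$(cat "$f")"
    done
    return 0
}
```

Running `show_queue_settings` with no argument reads /sys/block; in the scheduler lines, the entry in square brackets is the one currently active.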
December 12, 2020

It looks like I have this issue. I can't see what the exact fix was, if there is one. Can anyone tell me?
December 13, 2020 (Author)

6 hours ago, Nanobug said:
    It looks like I have this issue. I can't see what the exact fix was, if there is one. Can anyone tell me?

I think excluding the USB drive from Dynamix Cache Directories fixed it.
December 13, 2020

5 hours ago, Koshy said:
    I think excluding the USB drive from Dynamix Cache Directories fixed it.

How do you do that?
December 13, 2020

3 hours ago, Nanobug said:
    How do you do that?

It's in the Dynamix Cache Directories plugin: Settings -> Folder Caching. I think the better way to do it is to set only the folders you want to cache, rather than let it cache everything. That's the 'Included Folders' setting.

As this user found out, Cache Directories will add UD disks unless you either set it to only include certain folders, or exclude the UD-mounted disk specifically.
December 13, 2020

37 minutes ago, dlandon said:
    It's in the Dynamix Cache Directories plugin: Settings -> Folder Caching. I think the better way to do it is to set only the folders you want to cache, rather than let it cache everything. That's the 'Included Folders' setting.

    As this user found out, Cache Directories will add UD disks unless you either set it to only include certain folders, or exclude the UD-mounted disk specifically.

I've tried excluding the two folders it's being used for. I'll get back here when I know if it helped or not.
December 16, 2020

Before I made the change, CPU usage was always high with iowait. Now iowait only goes up when something is using that disk. So the overall performance is better, but when it needs to read from or write to that disk, iowait goes up by a lot.

I guess I'll have to live with this until I get a replacement disk.
This topic is now archived and is closed to further replies.