Tracking down IOwait cause

DanielPT · November 8, 2023

13 minutes ago, JorgeB said:

And the torrents/downloads are also going to exclusive shares?

Nope, because downloads are also on the array.

JorgeB · November 9, 2023

That can be the reason.

habron · February 13

I've been dealing with the same problem for several years now. In the past, it wasn't a big deal for me because I didn't need the parity, but now, more and more data are on the array that I don't want to lose.

I'm not able to create a parity disk, even when I shut down Docker, remove the VM Manager, delete all plugins, and so on. First, I assumed there was a problem with the motherboard (ASUS ROG STRIX B550-F GAMING), so I updated the motherboard and checked all settings twice. Second, I assumed it was the Adaptec HBA because it was mentioned somewhere that an old firmware on the controller (Adaptec ASR-71605) itself has a problem, so I updated the controller.

When I just copy files between the disks, I get 10 to 20% iowait.

Maybe the controller will not work with Unraid or needs another driver.

I assume none of you have solved the issue, right?

itimpi · February 13

11 minutes ago, habron said:

I'm not able to create a parity disk,

What happens when you try to do this? This seems to have nothing to do with IOwait, which is the subject of this thread. Perhaps posting your diagnostics after an attempt will help us spot something?

habron · February 13

I assume it's because of the iowait situation: if I add a parity device and start syncing, the iowait increases just as it does when I copy files. When the iowait lasts too long, it causes read errors on a disk in the array (not always the same one, and not always at the same time), and if there are too many errors, they occur on the parity disk, and Unraid disables it.

I'm currently switching from btrfs and XFS to ZFS. Once that is done in the next few days, I will dig deeper into the issue.

I'm trying several things right now, so don't be surprised if there are strange things in the config.

inge-diagnostics-20240213-1639.zip

JorgeB · February 13

1 hour ago, habron said:

it causes read errors on a disk in the array (not always the same one, and not always at the same time),

There's a known issue with that Adaptec controller. Try the recently released v6.12.7-rc2, which should fix it; if it still fails, post new diags after that.

habron · February 15

Quote

No, it's because of the IOwait situation: if I put a parity device in and start syncing, the IOwait rises the same as when I copy files. When the IOwait situation goes on too long

Great support!!! Thanks, that helps. I was able to create the parity!❤️

IO wait is still a topic, but it's possible that the HDDs are the bottleneck.🤷‍♂️

Edited February 15 by habron

JorgeB · February 15

Some I/O wait can be normal. You can also take a look at turbo write for better write performance, at the expense of all disks spinning up for writes.
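For anyone who wants to put a number on "some I/O wait", here is a minimal Python sketch that derives an iowait percentage from two snapshots of the aggregate `cpu` line in `/proc/stat`. It assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are made up for illustration and are not part of Unraid or any plugin.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line in /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two snapshots, in percent."""
    # Only the first 8 counters (user..steal) are summed; guest time is
    # already accounted for inside user/nice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return iowait percent."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)
```

Calling `sample_iowait(5.0)` during a file copy and again while the server is idle gives two figures to compare. Note that this is averaged over all cores, so a single stalled disk can look smaller here than it feels in practice.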

robo22 · February 28

Nice to know I'm not alone in this issue with the IOwait time going up super high and slowing down the server.
I can attach my diagnostic log here if it might help, or post it in a new thread if that's preferred.

don4of4 · May 25

Hey all — I was just wandering around and wanted to share my experience with IOWait.

Background: I have a 220 TiB server with 128 GiB of RAM. Nothing serious is running beyond what others in here have been running, on a 32-core CPU. IOwait was regularly at ~50%, and critically the system became unresponsive every minute or so.

Root cause: FUSE getting overwhelmed, which filled the dirty page cache up to the vm.dirty_ratio limit, causing the kernel to lock until everything was flushed to disk.

What helped:

- Install the Tips and Tweaks plugin to allow you to easily set these cache values.

- Reduce the time it takes to start writing to disk by setting vm.dirty_background_ratio to zero or 1-2 (% of your RAM). This helps reduce the lag between your RAM starting to fill up with pending disk I/O and the start of writing to disk.

- Reduce the maximum cache size by reducing vm.dirty_ratio. I found 5 (% of RAM) worked reasonably well. Remember that if you hit this limit, everything that was writing to RAM (rather than disk) now has to both wait for the flush and switch to blocking I/O. That said, your disks are very slow compared to memory, so you need this overall cache to be small or zero to eliminate the instability at flush.
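To get a feel for what those percentages mean in bytes, here is a rough Python sketch. It is only an approximation: the kernel applies the ratios to available memory (free plus reclaimable pages), not to total RAM, so real thresholds are somewhat lower, and the ratios are ignored if `vm.dirty_bytes` / `vm.dirty_background_bytes` are set instead. The helper names and the 150 MB/s disk speed are assumptions for illustration, not measured values.

```python
GIB = 1024 ** 3


def dirty_thresholds(available_bytes, background_ratio, dirty_ratio):
    """Approximate (background, hard) dirty-page thresholds in bytes."""
    background = available_bytes * background_ratio // 100
    hard = available_bytes * dirty_ratio // 100
    return background, hard


def seconds_to_flush(dirty_bytes, disk_bytes_per_second):
    """Rough time to write out `dirty_bytes` at a sustained disk speed."""
    return dirty_bytes / disk_bytes_per_second


def read_vm_setting(name, root="/proc/sys/vm"):
    """Read an integer sysctl such as 'dirty_ratio' from /proc/sys/vm."""
    with open(f"{root}/{name}") as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # 128 GiB of RAM with vm.dirty_background_ratio=1 and vm.dirty_ratio=5.
    background, hard = dirty_thresholds(128 * GIB, 1, 5)
    print(f"background flush starts near {background / GIB:.2f} GiB")
    print(f"writers block near {hard / GIB:.2f} GiB")
    # Assumed sustained write speed of a single HDD: 150 MB/s.
    print(f"flushing the hard limit takes ~{seconds_to_flush(hard, 150e6):.0f} s")
```

The point of the last line is the trade-off described above: the larger the hard limit, the longer everything stalls once it is reached, because a single array disk drains it far more slowly than RAM filled it.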

What eliminated the issue:

- Switching my main array to ZFS on Unraid. I have been really happy with performance: network throughput is 3x, IOwait is gone, and I am getting almost 10 Gbit/s with a 7 x 20 TiB * 3 vdev configuration on Unraid. (A word of warning: I almost lost all my data in a phased transfer to ZFS, so back up your data or consider a new JBOD if you are new to ZFS.)

References:

- https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

- Unraid Support

Edited May 25 by don4of4
Adding context.
