High IOWait with Docker when parity is enabled



Hello everyone,

I'm writing here because I have a problem I can't solve on my own.

 

First, here is my Unraid server configuration:

- CPU: Intel 8400T

- 16 GB of RAM

- Array composed of 2 SATA SSDs (one of the two being the parity drive)

Because I only have SSDs in my server, I don't have a cache drive.

- 20 Docker containers, many of them with dedicated IPv4 and IPv6 addresses on a custom interface (neither host nor bridge, so they are directly accessible on my network). Docker is configured with a docker.img formatted as BTRFS.

- 2 VMs: my pfSense firewall with PCIe passthrough for its NIC, and my Home Assistant VM

 

Description of the problem:

For the past few days, my Unraid server has been very unresponsive.

Much of the time, I cannot access the web interfaces of my containers (Plex, Nextcloud, etc.). I also cannot access the Unraid Docker tab (or only with a lot of pain), and every Docker command (docker container stop, for example) is painfully slow, taking minutes.

On the other hand, my VMs were working as expected, at normal speed.

So I tried to find out what was causing this behavior.

 

Searching for the cause of the problem:

First, I found that the load average (shown by htop) was very high, above 20, while CPU utilization was low (5-10%). So I suspected IOWait.

Running iotop, I could see multiple processes using a lot of IO: "unraidd1", "docker ...", etc.
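
For reference, here is roughly how I checked (a minimal sketch; iostat comes from the sysstat package and may not be installed by default):

# Load average, and the "wa" (IOWait) column
uptime
vmstat 2 5

# Accumulated IO per process, showing only active ones
iotop -oa

# Per-device utilization; %util near 100% on a single drive points at the bottleneck
iostat -x 2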

High IOWait was also confirmed by Netdata (here is an example, but IOWait can go above 90%):

 

[Screenshot: Netdata graph showing the IOWait spikes]

 

 

So first, I tried to see whether this high IOWait was caused by a particular container. But even after shutting down all containers, IOWait stays high for a few minutes before returning to a normal value (<3%), so it looks like the problem is not with a single container but with Docker itself. Once IOWait returns to a low value, the Docker tab and docker commands become responsive again.
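
If you want to reproduce the test, this is roughly what I did (a sketch):

# Stop every running container at once
docker stop $(docker ps -q)

# Then watch the "wa" column: it stays high for a few minutes before dropping below 3%
vmstat 2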

 

I also saw in Netdata that during high IOWait, disk utilization was at 100% on the parity drive but not on the other drive.

 

A temporary solution:

After a lot of searching, I tried disabling parity in Unraid (by stopping the array and removing the parity disk), and the high IOWait problem is gone.

 

I need help:

Does anyone have an explanation and a solution for this problem?

I didn't have this problem a few weeks ago...

IOWait is so high that I'm not able to sync parity again: if Docker is enabled and my containers are started, the problem appears immediately, the parity sync is painfully slow (about 2 MB/s, on SSDs), and every container is unresponsive...

 

Do you think this is a hardware failure?

I replaced the parity SSD a few months ago, so it's almost new.

 

You can find my Unraid diagnostics below.

 

 

Thanks a lot

 


Both the system share and the appdata share exist on disk 1, which is going to result in what you're seeing.

 

Parity (whether it's an SSD or not) incurs a big write penalty: writes are automatically 4x slower by default. Since you're using all SSDs, enable "Reconstruct Write" in Settings - Disk Settings to speed things up.
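
For context: the default read/modify/write mode turns every array write into four disk operations (read old data, read old parity, write new data, write new parity), which is where the 4x penalty comes from. Reconstruct write instead reads the other data disks and writes data plus parity in one pass. If you prefer the command line to the GUI, I believe the equivalent toggle is mdcmd (an assumption; double-check the values on your Unraid version):

# Reconstruct ("turbo") write -- same setting as in Settings - Disk Settings
mdcmd set md_write_method 1

# Back to the default read/modify/write
mdcmd set md_write_method 0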

 

Ideally, though, the system and appdata shares should be on a separate pool outside the array for the best performance (backing up to the array on a schedule if needed).

 

Keeping this stuff outside the traditional array also has another huge benefit: trim support. The array does not support trim, but cache pools do (install Dynamix SSD Trim if you're using a cache pool formatted as XFS). Lack of trim support will also degrade performance over time.
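
If you do move appdata to a pool, you can also trim it by hand to verify it works; a minimal sketch, assuming the pool is mounted at /mnt/cache:

# Reports how many bytes were trimmed; the Dynamix SSD Trim plugin just schedules this for you
fstrim -v /mnt/cache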


If the drives are in the array (parity, data), then trim is not supported.

 

If the drive is in a cache pool, then yes, BTRFS supports trim automatically; for XFS you need to install Dynamix SSD Trim.
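
You can check whether a device (and the controller in front of it) actually passes discards through with lsblk; all-zero discard columns mean trim won't work on that path:

# Non-zero DISC-GRAN and DISC-MAX values mean the device accepts discard/trim
lsblk --discard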

 

(If you're running a single-device pool, XFS is a better choice than BTRFS.)

1 year later...

@Squid Reading through this thread, I can see that my IOWait is over 20, sometimes close to 30 (or more).

 

I made the bone-headed move of evacuating my 4 array disks one at a time and converting them to ZFS, because I figured that was smart.

@MAM59 corrected me: given the painfully slow transfer speeds of 16-20 MB/s on my array, ZFS was the culprit and I should convert all my disks back to XFS.

ZFS Array Transfers very slow

 

All my array disks were Seagate IronWolf 8TB CMR 7200 RPM (2 parity, 4 data), attached via an LSI SAS-9211-8i HBA with breakout cables.

 

I performed an offsite backup of the critically important data, backed up the flash drive, did a New Config, and unassigned the 2 parity disks.

I will be replacing the 2 8TB parity disks with a single IronWolf 12TB (still nervous about 2 parity disks being overkill, but I'll submit).

Now I am evacuating disks 1-4 of the array and reformatting them back to XFS.

Disk 1 is complete and back on XFS, and I am moving all the data from disk 2 over to disk 1 at approx. 160 MB/s.
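
For anyone following along, the evacuation itself is just a disk-to-disk copy, roughly like this (a sketch; the paths match my layout):

# Copy everything from disk 2 to disk 1, preserving permissions and extended attributes
rsync -avX /mnt/disk2/ /mnt/disk1/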

 

All that being said, I followed @SpaceInvaderOne's video on converting the cache NVMe (Samsung 990 Pro 1TB) to ZFS, and thought I'd be smart and do all 4 array disks as well. Having read about TRIM support and high IOWait times: is this a recommended setup, with the cache NVMe as ZFS? The final outcome will look like this:

Parity: IronWolf 12TB

Array: 4x IronWolf 8TB (XFS)

Cache: Samsung 990 Pro 1TB (ZFS)

 

Should I convert that cache NVMe back to XFS as well? My end goal is to protect against a disk failure with parity while still getting somewhat decent array transfer speeds with relatively low CPU and IOWait. This server doesn't host any VMs at all.

16-20 MB/s was disgusting, and my IOWait is far too high, so I'm trying to narrow down where my issue lies.

 


 

48 minutes ago, 905jay said:

that ZFS was the culprit

No no 🙂

(maybe my bad English was not clear enough once more)

ZFS itself is fine, but when used in the array with parity drives, it becomes an IO hog and the system slows down enormously.

 

ZFS is always working in the background: every few seconds, it tries to optimize its drives. If there are many single ZFS drives (individual ZFS disks, not a ZFS RAID array), they do this independently of each other, seeking around wildly. By itself this wouldn't harm anything; they only do it in idle time.

 

But there is a parity drive (which ZFS is not aware of) that has to follow all of these "optimizations", and this really adds up and pushes IO to its limit in some situations.
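
You can actually watch these background flushes happen: ZFS batches writes into transaction groups that commit every few seconds, and with single ZFS disks in the array each commit hits the parity drive too. A quick sketch:

# Per-pool IO once per second; the periodic write bursts are the transaction group commits
zpool iostat -v 1

# The commit interval in seconds (the default is 5)
cat /sys/module/zfs/parameters/zfs_txg_timeout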

 

ZFS on a cache drive is fine, but wasted. Cache drives are usually very fast SSDs (at least they should be), so there is no benefit from the optimizations.

 

8 hours ago, MAM59 said:

ZFS itself is fine, but when used in the array with parity drives, it becomes an IO hog and the system slows down enormously. [...]

Your English is perfect; I just didn't deliver the message correctly, @MAM59.

I will add that correction: ZFS is perfectly fine for non-parity-protected disks, but with a parity disk (or 2 parity disks), it is more of a drawback in the array than a benefit.

The reason it is a drawback in the array is that ZFS performs optimizations and self-correction, and every optimization and correction has to be written to parity (or 2 parity disks, in my case), which was causing high IO and high CPU use.

