[6.9.0 beta 35] BTRFS Raid 1 pool issue

Closed

This morning I found all containers and VMs that use the BTRFS Raid 1 cache pool unresponsive, with numerous errors in their logs, and the syslog was flooded with messages like "Dec 13 06:56:54 NAS kernel: sd 5:0:0:0: [sdi] tag#6 access beyond end of device".

The cache pool is composed of two SATA SSDs:

- sdi: 840 Pro 512GB

- sdh: 860 Evo 500GB

I quickly worked out that every write to the cache pool had been reporting errors since Dec 13 06:55:14.
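For anyone wanting to pin down when the errors started and which device they hit, here is a minimal Python sketch that scans syslog lines for the message quoted above. The regular expression is an assumption based on that single quoted line, the `summarize` helper is a hypothetical name, and the second sample line is made up purely for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one message quoted in this post; the exact
# syslog layout on other systems is an assumption.
PATTERN = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*"
    r"\[(?P<dev>sd[a-z]+)\].*access beyond end of device"
)

def summarize(lines):
    """Count matching errors per device and record the first timestamp seen."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = PATTERN.search(line)
        if not match:
            continue
        dev = match.group("dev")
        counts[dev] += 1
        # setdefault keeps the earliest timestamp when lines are in log order
        first_seen.setdefault(dev, match.group("ts"))
    return counts, first_seen

sample = [
    # Quoted from the syslog in this post
    "Dec 13 06:56:54 NAS kernel: sd 5:0:0:0: [sdi] tag#6 access beyond end of device",
    # Hypothetical unrelated line, for illustration only
    "Dec 13 06:56:55 NAS kernel: some unrelated message",
]

counts, first_seen = summarize(sample)
for dev, n in counts.items():
    print(f"{dev}: {n} error(s), first at {first_seen[dev]}")
```

To run it against a real log, pass an open file instead of `sample`, for example `summarize(open(path))` with `path` pointing at the syslog inside the extracted diagnostics.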

A similar issue already happened once on 6.9.0 beta25, but this time I was clever enough (?) to download the attached diags before stopping the VM manager and Docker and then rebooting. After the reboot, I ran a full balance and a scrub on the pool (no errors), then restarted the VMs and containers, and everything works fine again. However, a parity check was launched after the reboot; for whatever reason, the shutdown was considered unclean.

It may be a hardware issue, but I've also always wondered whether it was a good idea of me to build a Raid 1 pool from two drives of different capacities. The pool, by the way, has a reported size of 506GB (!) in the "Main" tab.
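As a rough sanity check on that number: in a two-device raid1 profile every chunk is stored on both devices, so the usable space is limited by the smaller drive. The 506GB figure happens to equal half of the combined nominal capacity. The sketch below only does that arithmetic with the nominal sizes listed above. That the "Main" tab actually derives its figure this way is an assumption, and both function names are hypothetical.

```python
def raid1_usable_two_devices(a_gb, b_gb):
    """Usable space of a two-device raid1 profile: each chunk needs a copy
    on both devices, so the smaller device is the limit."""
    return min(a_gb, b_gb)

def half_of_total(sizes_gb):
    """Naive estimate: total raw capacity divided by the two copies."""
    return sum(sizes_gb) / 2

# Nominal sizes of the two SSDs listed in this post
sdi_gb, sdh_gb = 512, 500

usable = raid1_usable_two_devices(sdi_gb, sdh_gb)
naive = half_of_total([sdi_gb, sdh_gb])
print(f"limited by the smaller drive: {usable} GB")   # 500 GB
print(f"half of the combined capacity: {naive} GB")   # 506.0 GB
```

If that reading is right, the extra 6GB in the reported size is space that raid1 cannot mirror.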

Thanks in advance for having a look at the diags, and hopefully for giving me some ideas on how to get rid of this recurring and very worrying instability.

nas-diagnostics-20201213-0948.zip
