Thank you for the deep dive into the inner workings of btrfs. Back when we implemented the VM manager we did indeed want to provide vdisk redundancy via the btrfs raid1 profile. Our early testing showed a very obvious performance hit with a COW vdisk vs. NoCOW. This was circa 2016/2017, and we were aware of the discussion and patch set that ultimately arose. Actually, my assumption was that btrfs metadata would keep track of which chunks were successfully committed to storage; apparently this is not the case?

It has also always bugged me that btrfs does not maintain checksums across NoCOW chunks. I can't think of a logical reason why this decision would be made in the code. edit: I guess it's to avoid read/modify/write, plus the fact that without COW a data block and its checksum can't be updated atomically, so a crash between the two writes would leave valid data flagged as corrupt.
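For the curious, here's a minimal sketch of how the NoCOW attribute gets applied at the file level (the same thing `chattr +C` does). The ioctl constants are the 64-bit Linux values from &lt;linux/fs.h&gt;, and the important constraint is that the flag only takes effect while the file is still empty:

```python
import fcntl
import struct

# ioctl request numbers from <linux/fs.h>, as encoded on 64-bit Linux
# (they embed sizeof(long), so 32-bit systems use different values).
FS_IOC_GETFLAGS = 0x80086601
FS_IOC_SETFLAGS = 0x40086602
FS_NOCOW_FL     = 0x00800000  # the attribute behind `chattr +C`

def set_nocow(path: str) -> None:
    """Create an empty file and mark it NoCOW.

    The flag is only honored on empty files, which is why an existing
    vdisk image can't simply be converted in place.
    """
    with open(path, "w") as f:
        # Read the current attribute flags into a mutable buffer.
        buf = bytearray(struct.pack("l", 0))
        fcntl.ioctl(f.fileno(), FS_IOC_GETFLAGS, buf)
        flags = struct.unpack("l", buf)[0] | FS_NOCOW_FL
        fcntl.ioctl(f.fileno(), FS_IOC_SETFLAGS, struct.pack("l", flags))
```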
Sure, we can change the default to COW for the domains share. I think your testing shows that the best performance in this case happens when vdisk files are also pre-allocated, correct? Also, changing the default will have no effect on an existing domains share. To get rid of existing NoCOW flags, one must empty the domains share, delete it, and then recreate it, since the attribute is inherited from the parent directory when each file is created and can only be set while a file is empty.
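As a sketch of what "pre-allocated" means here, assuming a raw image and an illustrative path (in practice `qemu-img create -f raw -o preallocation=falloc` does the same job):

```python
import os

def create_prealloc_vdisk(path: str, size_bytes: int) -> None:
    """Create a raw vdisk whose full size is allocated up front,
    so extents are laid out contiguously instead of being allocated
    (and potentially fragmented) on first write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)

# Illustrative path and size only.
create_prealloc_vdisk("/mnt/user/domains/testvm/vdisk1.img", 30 * 2**30)
```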
Moving forward into Unraid 6.11 we plan to introduce an "LVM" pool type of up to 3 mirrored devices. This will be used to create logical volumes to be used as vdisks in VMs. This should provide near bare-metal storage performance since we completely bypass the intermediate file system where vdisk (loopback) files would otherwise be stored.
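In plain LVM terms (not necessarily how the Unraid UI will wire it up), the idea looks roughly like this, with /dev/sdb and /dev/sdc standing in for whatever devices get assigned to the pool:

```python
import subprocess

def run(cmd: list[str]) -> None:
    """Run one LVM command, raising if it fails."""
    subprocess.run(cmd, check=True)

# Device names and sizes are placeholders.
run(["pvcreate", "/dev/sdb", "/dev/sdc"])
run(["vgcreate", "vmpool", "/dev/sdb", "/dev/sdc"])

# One 2-way mirrored logical volume per vdisk.
run(["lvcreate", "--type", "raid1", "-m", "1",
     "-L", "100G", "-n", "vdisk1", "vmpool"])

# The VM then points at /dev/vmpool/vdisk1 as a raw block device;
# there is no image file or loop device in the I/O path.
```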