Same here, try increasing the Samba log level, something might be visible there:
Go to Settings -> SMB -> SMB Extras and add:
log level = 3
logging = syslog
Then check or post the syslog when it happens again. Note that Samba will get very chatty, so it can fill up the log after a few hours.
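At log level 3 the syslog fills up with a lot of non-Samba noise as well; a quick grep keeps only the Samba daemons' lines. This is just an illustrative sketch, the sample log lines and /tmp paths below are made-up placeholders, on the server you'd grep /var/log/syslog directly:

```shell
# Hypothetical sample lines standing in for /var/log/syslog entries:
printf '%s\n' \
  'Jan 26 10:34:01 Tower smbd[1234]: connect to service media' \
  'Jan 26 10:34:02 Tower kernel: eth0: link up' \
  > /tmp/sample_syslog

# Keep only lines from the Samba daemons:
grep -E 'smbd|nmbd|winbindd' /tmp/sample_syslog
```

On the server the same filter would be `grep -E 'smbd|nmbd|winbindd' /var/log/syslog`, which makes the log a lot easier to post and read.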
The disk does look healthy, but sometimes healthy-looking disks still have issues. Since you have other identical disks, that rules out any compatibility issue between the NetApp and that specific model. Any chance you have a spare you could try instead?
Respectfully disagree. Even if I thought this was a bug I couldn't move it to the bug reports forum, since that one uses a different database, so IMHO this was the correct forum for the original post, and it's far from a junk forum with multiple daily posts. If appropriate it can then be escalated to the bug reports forum, as was already suggested twice; you can create a bug report there anytime. It can take a little longer to get support for a VM related issue, but that's mostly because fewer users run VMs, and many of those issues are hardware specific.
The FS on disk1 appears to be fixed. Now about sdj: I thought you were saying the disk dropped offline during this session, but I believe the disk lost its partition instead, correct? In that case it happened before rebooting, and Unraid only stores the logs from the current boot, so we can't see what happened.
xfs_repair is just for XFS-formatted drives; that one uses ext4, so you run the filesystem check from within UD instead.
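If you'd rather run it by hand, the equivalent of UD's check for an ext4 partition is fsck.ext4. The demo below runs it against a throwaway image file so nothing real is touched (the /tmp path and 16M size are arbitrary); on the server you'd point it at the unmounted UD partition instead, e.g. /dev/sdX1:

```shell
# Create a small scratch ext4 filesystem inside a plain file:
truncate -s 16M /tmp/ext4_demo.img
mkfs.ext4 -q -F /tmp/ext4_demo.img

# -n = read-only check: report problems, change nothing.
fsck.ext4 -n /tmp/ext4_demo.img

# On the real (unmounted) partition you'd follow up with
# 'fsck.ext4 -p /dev/sdX1' to actually repair what was found.
```

The device must be unmounted before checking; /dev/sdX1 here is a stand-in for whatever UD shows for that drive.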
No, it only resets the array assignments, and you can even select the option to keep the current assignments. Then just add the new disks and start the array to begin a parity sync (the new disks will need to be formatted). The main danger when doing this is assigning a previous data disk to a parity slot, which would result in data loss.
Yep, look for any LSI with a SAS2008/2308/3008/3408 chipset in IT mode, e.g. the 9201-8i, 9211-8i, 9207-8i, 9300-8i, 9400-8i, etc., and clones like the Dell H200/H310 and IBM M1015; these latter ones need to be crossflashed.
You're using an LSI MegaRAID controller; those are not recommended, though they can be used if you create a RAID0/JBOD volume for each disk. Note that it's an older SAS1 model limited to 2.2TB drives, and also that if you ever need to change the controller Unraid won't accept the disks. The 10GbE NIC is being detected correctly and should be working, but it needs to be configured: Settings -> Network Settings; it's currently eth2.
Yes, that bug sucks, but you need to have backups of anything important; many other things can happen that make you lose data.
One of the devices appears to have been cleared, likely from previous troubleshooting you did, like trying to re-add it to the pool. Make sure you try the recovery options in the FAQ on both devices; if there's still nothing, there isn't much more I can help with, but you can try looking for more advanced help on IRC (#btrfs) or the btrfs mailing list.
No, you posted on the general support forum and I moved it since this was about KVM, bug reports are here:
https://forums.unraid.net/bug-reports/stable-releases/
Basically yes, metadata will be missing, so recovery is much more difficult, sometimes impossible.
Post current diags after trying to start the array with both original cache devices.
There are lots of read/write errors on cache2:
Jan 26 10:34:01 Starswirl kernel: BTRFS info (device sdf1): bdev /dev/sdh1 errs: wr 263018, rd 23717, flush 2409, corrupt 0, gen 0
With SSDs this is usually a cable problem; replace both cables and run a scrub, see here for more info.
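For reference, this is roughly the sequence to check and scrub the pool from the console after swapping the cables. A sketch only, assuming the pool is mounted at /mnt/cache (the Unraid default):

```shell
# Per-device error counters (wr/rd/flush/corrupt/gen),
# the same numbers shown in the syslog line above:
btrfs device stats /mnt/cache

# Scrub reads everything and repairs from the redundant copy where possible:
btrfs scrub start /mnt/cache
btrfs scrub status /mnt/cache   # check progress and results

# Once the scrub comes back clean, reset the counters so any new errors stand out:
btrfs device stats -z /mnt/cache
```

If the error counters start climbing again after new cables, the problem is more likely the SSD itself or the port/backplane.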
What error do you get when trying the first recovery option? Also, do you know when the pool was created? There was a bug in v6.7.x where any pool created would have non-redundant metadata.
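If the pool does mount again, you can confirm whether it was hit by that bug from the console; on a healthy redundant pool the Metadata line shows RAID1, while pools created with the v6.7.x bug show "single" instead. A sketch, assuming the pool is mounted at /mnt/cache:

```shell
# Show the allocation profiles for data, metadata and system chunks:
btrfs filesystem df /mnt/cache

# If it reports 'Metadata, single', convert it back to redundant metadata:
btrfs balance start -mconvert=raid1 /mnt/cache
```

With redundant metadata a scrub can repair corruption from the good copy; with single metadata a bad sector in the wrong place can make the whole pool unmountable.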