A small update on the possible tg3-related corruption issue. First, I know it's a pain for some users to update and lose the NIC, especially if the server is remote, but because of this, some affected users are now aware of the issue and of what they need to do to avoid it. Also note that unrelated changes were made to the interface rules config; users who lose network because of bonding and other config issues, like a couple of users in the posts above, are not hitting the tg3 issue.
As for the affected servers, these are known to be affected if vt-d is enabled:
HP ProLiant MicroServer Gen8
IBM/Lenovo x3100 M5
HP ProLiant ML350p Gen8
HP ProLiant ML310e Gen8
HP ProLiant DL20 Gen9
Also most likely affected:
HP ProLiant ML350 Gen9
On all of these, if vt-d is enabled, after a few hours of use you should start seeing an error similar to this repeating in the log:
May 23 18:58:31 unraidSERVER kernel: DMAR: ERROR: DMA PTE for vPFN 0xbdf79 already set (to bdf79003 not 19a5a1803)
May 23 18:58:31 unraidSERVER kernel: ------------[ cut here ]------------
May 23 18:58:31 unraidSERVER kernel: WARNING: CPU: 19 PID: 47787 at drivers/iommu/intel/iommu.c:2408 __domain_mapping+0x2e5/0x390
This can be followed by some corruption, which can be more or less severe, or possibly even nonexistent, but for now I wouldn't risk running a server if the above error appears. It should go away if vt-d is disabled.
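If you'd rather not read through the log by hand, a minimal Python sketch like the one below can count the matching lines. The regex is modelled only on the sample line quoted above, so the wording may differ on other kernel versions, and the `/var/log/syslog` default path and both function names are assumptions for illustration.

```python
import re

# Pattern based on the sample log line quoted above (an assumption: the
# exact wording may vary between kernel versions).
DMAR_PTE_ERROR = re.compile(
    r"DMAR: ERROR: DMA PTE for vPFN 0x[0-9a-fA-F]+ already set"
)


def find_dmar_errors(lines):
    """Return the log lines that contain the DMAR 'already set' error."""
    return [line.rstrip("\n") for line in lines if DMAR_PTE_ERROR.search(line)]


def scan_syslog(path="/var/log/syslog"):
    """Scan a syslog file; the default path is an assumption, adjust as needed."""
    with open(path, errors="replace") as fh:
        return find_dmar_errors(fh)
```

Calling `scan_syslog()` returns a list of matching lines. An empty list only means the error has not been logged in that file so far, not that the server is unaffected.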
Because the NIC is being blacklisted, there have been posts from several users running Dell servers with a NIC that uses the tg3 driver. As of now I haven't found any signs of the above error or corruption in those servers, so it *might* be safe to continue running them with vt-d on, especially if there are no signs of the above error in the logs.
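For anyone unsure whether their NIC uses tg3 at all, here is a rough Python sketch that lists the interfaces currently bound to a given driver by reading the `device/driver` symlinks under `/sys/class/net`. The function name and parameters are made up for illustration. It only reports NICs that are bound right now, so an interface whose driver is blacklisted and never loaded will not show up.

```python
import os


def interfaces_using_driver(driver="tg3", sysfs_net="/sys/class/net"):
    """Return the names of network interfaces bound to the given kernel driver."""
    matches = []
    for iface in sorted(os.listdir(sysfs_net)):
        link = os.path.join(sysfs_net, iface, "device", "driver")
        # Virtual interfaces (lo, bridges, bonds) have no device/driver link.
        if os.path.islink(link):
            # The link target ends in the driver name, e.g. ".../drivers/tg3".
            if os.path.basename(os.readlink(link)) == driver:
                matches.append(iface)
    return matches
```

On a live system `interfaces_using_driver()` with the defaults is enough; the `sysfs_net` parameter is only there so the function can be pointed at a different directory.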
So does this mean the tg3 driver is not the problem? I don't know, it's still my best guess. Besides a bunch of Intel devices, which I can't believe are the source of the problem or there would be a lot more cases, the tg3 NIC is the only thing I found in common across all the affected servers, so it would be a big coincidence. I can't say for sure since I don't have the hardware to test; hopefully it will become clearer in the coming days.
@Thorsten found the exact same issue reported for Ubuntu and ZFS, confirming, as suspected, that this is a general kernel issue, not an Unraid issue.