Unraid OS version 6.10.0 available



I have a Dell R710 with a dual-port NIC (second port disabled), a Broadcom BCM5709 using the bnx2 kernel driver. I have seen kernel DMAR errors similar to those others have posted here. I also ran into an issue with the bzfirmware file not having the correct checksum after each new RC upgrade, until I switched the flash drive to my front USB port, which made upgrades extract bzfirmware correctly. My cache pool scrub reports 4 uncorrectable errors, and Docker (I believe) has on multiple occasions hung the entire system to the point where I could not reboot. Finally, and maybe unrelated, I ran into two CRC errors on my parity drive, which I had to acknowledge and then re-add the parity drive to recover from.

 

Currently I am backing up my appdata off the cache pool so I can rebuild the pool because of the uncorrectable csum errors, and I also want to delete and rebuild my docker.img. Virtualization is enabled in my BIOS because I would like to set up Home Assistant, but I can hold off on that to see whether I am in the same boat as others and should disable virtualization for now, before rebuilding my cache pool and Docker image. I have Docker turned off for now, and no errors have occurred since.

NIC Info (note NIC2 is disabled):

01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
	DeviceName: Embedded NIC 1
	Subsystem: Dell PowerEdge R710 BCM5709 Gigabit Ethernet
	Kernel driver in use: bnx2
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
	DeviceName: Embedded NIC 2
	Subsystem: Dell PowerEdge R710 BCM5709 Gigabit Ethernet
	Kernel driver in use: bnx2


BTRFS errors after boot scrub (and start of DMAR errors):

Jun  9 18:18:14 kernel: BTRFS info (device sdb1): scrub: started on devid 1
Jun  9 18:18:14 kernel: BTRFS info (device sdb1): scrub: started on devid 2
Jun  9 18:18:48 kernel: BTRFS warning (device sdb1): checksum error at logical 11833892864 on dev /dev/sdb1, physical 6465183744, root 5, inode 3598121, offset 200704, length 4096, links 1 (path: appdata/sabnzbd/logs/sabnzbd.log.1)
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 12, gen 0
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): unable to fixup (regular) error at logical 11833892864 on dev /dev/sdb1
Jun  9 18:18:48 kernel: BTRFS warning (device sdb1): checksum error at logical 11833892864 on dev /dev/sdc1, physical 6444212224, root 5, inode 3598121, offset 200704, length 4096, links 1 (path: appdata/sabnzbd/logs/sabnzbd.log.1)
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
Jun  9 18:18:48 kernel: BTRFS warning (device sdb1): checksum error at logical 11833896960 on dev /dev/sdb1, physical 6465187840, root 5, inode 3598121, offset 204800, length 4096, links 1 (path: appdata/sabnzbd/logs/sabnzbd.log.1)
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 13, gen 0
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): unable to fixup (regular) error at logical 11833892864 on dev /dev/sdc1
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): unable to fixup (regular) error at logical 11833896960 on dev /dev/sdb1
Jun  9 18:18:48 kernel: BTRFS warning (device sdb1): checksum error at logical 11833896960 on dev /dev/sdc1, physical 6444216320, root 5, inode 3598121, offset 204800, length 4096, links 1 (path: appdata/sabnzbd/logs/sabnzbd.log.1)
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Jun  9 18:18:48 kernel: BTRFS error (device sdb1): unable to fixup (regular) error at logical 11833896960 on dev /dev/sdc1
Jun  9 18:19:05 kernel: DMAR: ERROR: DMA PTE for vPFN 0xbf4df already set (to bf4df003 not 1011b0801)
Jun  9 18:19:05 kernel: ------------[ cut here ]------------
Jun  9 18:19:05 kernel: WARNING: CPU: 20 PID: 7515 at drivers/iommu/intel/iommu.c:2408 __domain_mapping+0x2e5/0x390
Jun  9 18:19:05 kernel: Modules linked in: xfs md_mod iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding bnx2 ipmi_ssif i2c_core intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd intel_cstate intel_uncore mpt3sas input_leds led_class ata_piix ipmi_si raid_class wmi acpi_power_meter scsi_transport_sas button acpi_cpufreq [last unloaded: bnx2]
Jun  9 18:19:05 kernel: CPU: 20 PID: 7515 Comm: apcupsd Tainted: G          I       5.15.43-Unraid #1
Jun  9 18:19:05 kernel: Hardware name: Dell Inc. PowerEdge R710/0YMXG9, BIOS 6.6.0 05/22/2018
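The checksum warnings in the excerpt above can be tallied per device and file to see how widely the corruption is spread. The following is only a minimal sketch: the function name `summarize_csum_errors` is made up for illustration, and the regular expression assumes the exact warning format shown in this log, which may differ on other kernel versions.

```python
import re
from collections import Counter

# Assumption: matches "checksum error" warnings in the format seen in this thread's log.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<fs>\S+)\): checksum error at logical (?P<logical>\d+) "
    r"on dev (?P<dev>\S+),.*\(path: (?P<path>[^)]+)\)"
)


def summarize_csum_errors(lines):
    """Count checksum-error warnings per (device, file path) pair."""
    counts = Counter()
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("path"))] += 1
    return counts


# Two warnings copied from the log above, one per pool member.
sample = [
    "Jun  9 18:18:48 kernel: BTRFS warning (device sdb1): checksum error at logical 11833892864 "
    "on dev /dev/sdb1, physical 6465183744, root 5, inode 3598121, offset 200704, length 4096, "
    "links 1 (path: appdata/sabnzbd/logs/sabnzbd.log.1)",
    "Jun  9 18:18:48 kernel: BTRFS warning (device sdb1): checksum error at logical 11833892864 "
    "on dev /dev/sdc1, physical 6444212224, root 5, inode 3598121, offset 200704, length 4096, "
    "links 1 (path: appdata/sabnzbd/logs/sabnzbd.log.1)",
]

for (dev, path), n in sorted(summarize_csum_errors(sample).items()):
    print(f"{dev}\t{n}\t{path}")
```

When the same logical address fails on both pool members, as it does here, there is no good copy left to repair from, which is consistent with the "unable to fixup" lines.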

 

Kernel DMAR errors (these repeat several times on boot):

Jun  9 18:19:06 kernel: DMAR: ERROR: DMA PTE for vPFN 0xbf4de already set (to bf4de003 not 1011b0801)
Jun  9 18:19:06 kernel: ------------[ cut here ]------------
Jun  9 18:19:06 kernel: WARNING: CPU: 20 PID: 7515 at drivers/iommu/intel/iommu.c:2408 __domain_mapping+0x2e5/0x390
Jun  9 18:19:06 kernel: Modules linked in: xfs md_mod iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding bnx2 ipmi_ssif i2c_core intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd intel_cstate intel_uncore mpt3sas input_leds led_class ata_piix ipmi_si raid_class wmi acpi_power_meter scsi_transport_sas button acpi_cpufreq [last unloaded: bnx2]
Jun  9 18:19:06 kernel: CPU: 20 PID: 7515 Comm: apcupsd Tainted: G        W I       5.15.43-Unraid #1
Jun  9 18:19:06 kernel: Hardware name: Dell Inc. PowerEdge R710/0YMXG9, BIOS 6.6.0 05/22/2018
Jun  9 18:19:06 kernel: RIP: 0010:__domain_mapping+0x2e5/0x390
Jun  9 18:19:06 kernel: Code: 2b 48 8b 4c 24 08 48 89 c2 4c 89 e6 48 c7 c7 fe b1 11 82 e8 ef 87 2d 00 8b 05 1d 9b df 00 85 c0 74 08 ff c8 89 05 11 9b df 00 <0f> 0b 8b 74 24 38 b8 34 00 00 00 8d 0c f6 83 e9 09 39 c1 0f 4f c8
Jun  9 18:19:06 kernel: RSP: 0018:ffffc900081b7ad8 EFLAGS: 00010006
Jun  9 18:19:06 kernel: RAX: 0000000000000003 RBX: ffff8881087136f0 RCX: 0000000000000000
Jun  9 18:19:06 kernel: RDX: 0000000000000000 RSI: ffff8897dfd1c510 RDI: ffff8897dfd1c510
Jun  9 18:19:06 kernel: RBP: ffff88810806e300 R08: ffff88983ffa3e68 R09: 0000000000000000
Jun  9 18:19:06 kernel: R10: 2931303830623131 R11: 623131303120746f R12: 00000000000bf4de
Jun  9 18:19:06 kernel: R13: ffff8881087136f0 R14: 0000000000001000 R15: 00000000bf4de000
Jun  9 18:19:06 kernel: FS:  000014a59cb33f00(0000) GS:ffff8897dfd00000(0000) knlGS:0000000000000000
Jun  9 18:19:06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun  9 18:19:06 kernel: CR2: 0000000000462cb8 CR3: 0000000100da6005 CR4: 00000000000206e0
Jun  9 18:19:06 kernel: Call Trace:
Jun  9 18:19:06 kernel: <TASK>
Jun  9 18:19:06 kernel: ? alloc_iova+0x30/0x1a8
Jun  9 18:19:06 kernel: intel_iommu_map_pages+0xf3/0x102
Jun  9 18:19:06 kernel: __iommu_map+0x138/0x211
Jun  9 18:19:06 kernel: _iommu_map+0x23/0x51
Jun  9 18:19:06 kernel: __iommu_dma_map+0x94/0xc4
Jun  9 18:19:06 kernel: iommu_dma_map_page+0x100/0x142
Jun  9 18:19:06 kernel: usb_hcd_map_urb_for_dma+0xe5/0x2ee
Jun  9 18:19:06 kernel: usb_hcd_submit_urb+0x675/0x72b
Jun  9 18:19:06 kernel: ? update_load_avg+0x43/0x2ce
Jun  9 18:19:06 kernel: ? get_sd_balance_interval+0x18/0x3b
Jun  9 18:19:06 kernel: ? rpm_resume+0x45d/0x484
Jun  9 18:19:06 kernel: ? newidle_balance+0x21c/0x291
Jun  9 18:19:06 kernel: ? dequeue_entity+0x1e0/0x203
Jun  9 18:19:06 kernel: ? usb_submit_urb+0x2d6/0x4ed
Jun  9 18:19:06 kernel: hid_submit_ctrl+0x1c4/0x201
Jun  9 18:19:06 kernel: usbhid_restart_ctrl_queue.isra.0+0x8c/0xc7
Jun  9 18:19:06 kernel: usbhid_submit_report+0x25b/0x2e6
Jun  9 18:19:06 kernel: hiddev_ioctl+0x301/0x558
Jun  9 18:19:06 kernel: ? hrtimer_nanosleep+0x75/0xdf
Jun  9 18:19:06 kernel: ? hrtimer_init_sleeper+0x3d/0x3d
Jun  9 18:19:06 kernel: vfs_ioctl+0x1e/0x2b
Jun  9 18:19:06 kernel: __do_sys_ioctl+0x51/0x74
Jun  9 18:19:06 kernel: do_syscall_64+0x83/0xa5
Jun  9 18:19:06 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun  9 18:19:06 kernel: RIP: 0033:0x14a59cdfd067
Jun  9 18:19:06 kernel: Code: 3c 1c e8 2c ff ff ff 85 c0 79 97 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d1 1d 0d 00 f7 d8 64 89 01 48
Jun  9 18:19:06 kernel: RSP: 002b:00007ffcb5fa99f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jun  9 18:19:06 kernel: RAX: ffffffffffffffda RBX: 0000000000437370 RCX: 000014a59cdfd067
Jun  9 18:19:06 kernel: RDX: 00007ffcb5fa9a04 RSI: 00000000400c4807 RDI: 0000000000000006
Jun  9 18:19:06 kernel: RBP: 00000000004711c0 R08: 0000000000000000 R09: 0000000000000000
Jun  9 18:19:06 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000474650
Jun  9 18:19:06 kernel: R13: 00007ffcb5fa9a70 R14: 000000000000a000 R15: 0000000000009fff
Jun  9 18:19:06 kernel: </TASK>
Jun  9 18:19:06 kernel: ---[ end trace 8747f0981c782280 ]---
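Since these traces repeat, it can help to count the DMAR errors and list the affected vPFN values instead of reading every trace. This is a rough sketch only: `count_dmar_pte_errors` is a hypothetical helper name, and the pattern assumes the message wording shown in the trace above.

```python
import re
from collections import Counter

# Assumption: matches the "DMA PTE ... already set" message as it appears in this log.
DMAR_RE = re.compile(r"DMAR: ERROR: DMA PTE for vPFN (0x[0-9a-fA-F]+) already set")


def count_dmar_pte_errors(lines):
    """Count DMAR 'PTE already set' errors, keyed by the reported vPFN."""
    counts = Counter()
    for line in lines:
        match = DMAR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The two DMAR error lines quoted in this post.
dmar_sample = [
    "Jun  9 18:19:05 kernel: DMAR: ERROR: DMA PTE for vPFN 0xbf4df already set (to bf4df003 not 1011b0801)",
    "Jun  9 18:19:06 kernel: DMAR: ERROR: DMA PTE for vPFN 0xbf4de already set (to bf4de003 not 1011b0801)",
]

for vpfn, n in sorted(count_dmar_pte_errors(dmar_sample).items()):
    print(vpfn, n)
```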

 

Manual BTRFS scrub results:

UUID:             <redacted>
Scrub started:    Thu Jun  9 23:43:02 2022
Status:           finished
Duration:         0:06:09
Total to scrub:   143.61GiB
Rate:             350.23MiB/s
Error summary:    csum=4
  Corrected:      0
  Uncorrectable:  4
  Unverified:     0

 

@JorgeB I can run and send diagnostics; I don't believe I have a set from when the server hung and needed a hard reboot.

Edited by DannyR83
2 hours ago, DannyR83 said:

I can run and send diagnostics; I don't believe I have a set from when the server hung and needed a hard reboot.

Feel free to send them, but it looks like the exact same issue.

 

2 hours ago, DannyR83 said:

I also ran into an issue with the bzfirmware file not having the correct checksum after each new RC upgrade, until I switched the flash drive to my front USB port, which made upgrades extract bzfirmware correctly.

We believe this is related. A lot of Dell users had to add iommu=pt to be able to boot v6.10.x, and we also believe that doing so prevents the DMAR errors. I suggest you update to v6.10.3-rc1: it now comes with iommu=pt by default, which should fix this DMAR/corruption issue for all affected platforms.
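To confirm whether a running server actually booted with iommu=pt, the kernel command line can be inspected through /proc/cmdline, which is a standard Linux interface. The helper names below are illustrative only and are not part of Unraid; treat this as a sketch, not an official check.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token in a kernel command line."""
    return param in cmdline.split()


def running_with_iommu_pt(path: str = "/proc/cmdline") -> bool:
    """Check the running kernel's command line for iommu=pt (Linux only)."""
    return has_kernel_param(Path(path).read_text(), "iommu=pt")
```

Calling `running_with_iommu_pt()` from a Python prompt on the server returns True when the parameter is active. Matching whole tokens avoids false positives from similar parameters such as intel_iommu=on.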

 

 

 

1 hour ago, JorgeB said:

We believe this is related. A lot of Dell users had to add iommu=pt to be able to boot v6.10.x, and we also believe that doing so prevents the DMAR errors. I suggest you update to v6.10.3-rc1: it now comes with iommu=pt by default, which should fix this DMAR/corruption issue for all affected platforms.

Thank you!! I upgraded to v6.10.3-rc1 (there was a hiccup when trying to restart and I had to cold reboot; I assume the same issues led to this) and there were no errors at all. I will reformat my cache pool tomorrow and recreate my docker.img, and if I run into further issues I will send my diagnostics file.

On 6/15/2022 at 1:13 AM, JorgeB said:

This is no longer needed for v6.10.3; it already comes with iommu=pt by default.

Hot damn! I did try adding iommu=pt manually, and that has kept 6.10.2 up and running for the last couple of weeks. I was holding off on the 6.10.3 update pending verification that this issue had been fixed.

  • 1 month later...

In the Network Interface section, it's really difficult to see that 'General Info' is actually a drop-down menu, as it doesn't follow the standard GUI design of a rectangular box with a down arrow; it shows just the arrow. I had to search to find out whether there was a way to see a graph, and stumbled across the answer on page 3 of this topic. I also noticed that you can change from bond0 to eth0. My humble suggestion: could we make this more prominent, perhaps by using standard GUI formatting? Thank you for your consideration. 🙂
