Unraid server unresponsive during and after parity check until reboot



Good day legends!

 

I'm a relatively new Unraid user, coming from about 6 years of FreeNAS/TrueNAS. Everything has been migrated, and I'm into the Unraid ecosystem with both feet and very happy so far.

 

I do however have an issue, which happens during and after parity check.

When the parity check starts, the server is unresponsive on the Web GUI, as well as on SSH.

 - Web GUI - After many attempts at refreshing the page I eventually get through, but the experience is extremely slow.

 - SSH - Using MobaXterm, or SSH from another server on my network, doesn't get through.

 

Once the parity check is finished, it becomes more responsive on the Web GUI but still feels very sluggish. The only way to fix this is by rebooting.

 

I'm attaching diagnostics just after finishing a parity check and experiencing the sluggish performance.

I'm also attaching diagnostics after a reboot, and everything seems normal again.

 

Would appreciate any assistance whatsoever.

prime-diagnostics-20220613-1035.zip prime-diagnostics-20220613-1132.zip


Not sure if your i7 930 is the issue, but I see a call trace at the very beginning of the log:

 

Jun 12 11:16:34 Prime kernel: ------------[ cut here ]------------
Jun 12 11:16:34 Prime kernel: Your BIOS is broken; DMA routed to ISOCH DMAR unit but no TLB space.
Jun 12 11:16:34 Prime kernel: BIOS vendor: American Megatrends Inc.; Ver: 1501   ; Product Version: System Version
Jun 12 11:16:34 Prime kernel: WARNING: CPU: 0 PID: 1 at drivers/iommu/intel/iommu.c:5802 intel_iommu_init+0xa60/0x109f
Jun 12 11:16:34 Prime kernel: Modules linked in:
Jun 12 11:16:34 Prime kernel: CPU: 0 PID: 1 Comm: swapper/0 Tainted: G          I       5.15.43-Unraid #1
Jun 12 11:16:34 Prime kernel: Hardware name: System manufacturer System Product Name/P6X58D PREMIUM, BIOS 1501    05/10/2011
Jun 12 11:16:34 Prime kernel: RIP: 0010:intel_iommu_init+0xa60/0x109f
Jun 12 11:16:34 Prime kernel: Code: 49 89 c6 e8 41 08 f6 fe bf 01 00 00 00 49 89 c4 e8 34 08 f6 fe 4c 89 f1 4c 89 e2 48 c7 c7 39 ba 11 82 48 89 c6 e8 af 27 11 ff <0f> 0b 83 0d 34 72 1e 00 04 eb 0c 48 c7 c7 ad ba 11 82 e8 32 30 11
Jun 12 11:16:34 Prime kernel: RSP: 0000:ffffc90000023de0 EFLAGS: 00010282
Jun 12 11:16:34 Prime kernel: RAX: 0000000000000000 RBX: ffffffff82325180 RCX: 0000000000000003
Jun 12 11:16:34 Prime kernel: RDX: 000000000000020d RSI: ffffc90000023c68 RDI: 0000000000000001
Jun 12 11:16:34 Prime kernel: RBP: ffffc90000023ea0 R08: ffffffff822b4e28 R09: 0000000000000000
Jun 12 11:16:34 Prime kernel: R10: 726556206d657473 R11: 7953203a6e6f6973 R12: ffffffff82a0001c
Jun 12 11:16:34 Prime kernel: R13: 0000000000000000 R14: ffffffff82a00060 R15: 0000000000000000
Jun 12 11:16:34 Prime kernel: FS:  0000000000000000(0000) GS:ffff888333a00000(0000) knlGS:0000000000000000
Jun 12 11:16:34 Prime kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 12 11:16:34 Prime kernel: CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
Jun 12 11:16:34 Prime kernel: Call Trace:
Jun 12 11:16:34 Prime kernel: <TASK>
Jun 12 11:16:34 Prime kernel: ? __raw_spin_unlock+0x5/0x6
Jun 12 11:16:34 Prime kernel: ? __queue_work+0x27a/0x289
Jun 12 11:16:34 Prime kernel: ? __down_write_common+0x31/0x42e
Jun 12 11:16:34 Prime kernel: ? _raw_spin_unlock_irqrestore+0x19/0x1b
Jun 12 11:16:34 Prime kernel: ? e820__memblock_setup+0x7b/0x7b
Jun 12 11:16:34 Prime kernel: pci_iommu_init+0x16/0x3f
Jun 12 11:16:34 Prime kernel: do_one_initcall+0x78/0x178
Jun 12 11:16:34 Prime kernel: kernel_init_freeable+0x1d5/0x21f
Jun 12 11:16:34 Prime kernel: ? rest_init+0xbb/0xbb
Jun 12 11:16:34 Prime kernel: kernel_init+0x16/0x115
Jun 12 11:16:34 Prime kernel: ret_from_fork+0x22/0x30
Jun 12 11:16:34 Prime kernel: </TASK>
Jun 12 11:16:34 Prime kernel: ---[ end trace c35b66859ac47c1f ]---

 

 

There are also a lot of network errors later:

Jun 12 23:30:41 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 1020
Jun 12 23:30:41 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 940
Jun 12 23:30:41 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 1388
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0: error interrupt status=0x40000008
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 700
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 1510
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0: error interrupt status=0x40000008
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 1468
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0: error interrupt status=0x40000008
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 1020
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 428
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status 0x7ffc0001 length 236
Jun 12 23:30:55 Prime kernel: sky2 0000:06:00.0: error interrupt status=0x40000008
Jun 12 23:31:49 Prime kernel: net_ratelimit: 1 callbacks suppressed

Probably not the cause of your parity issues, but maybe of the connection issues?

This is probably related:

Jun 13 03:02:12 Prime kernel: br0: hw csum failure
Jun 13 03:02:12 Prime kernel: skb len=750 headroom=78 headlen=750 tailroom=772
Jun 13 03:02:12 Prime kernel: mac=(64,14) net=(78,20) trans=98
Jun 13 03:02:12 Prime kernel: shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
Jun 13 03:02:12 Prime kernel: csum(0xc301ddf4 ip_summed=2 complete_sw=0 valid=0 level=0)
Jun 13 03:02:12 Prime kernel: hash(0x9a7cdc9c sw=0 l4=0) proto=0x0800 pkttype=0 iif=12
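For anyone reading that dump: the `ip_summed` field in the `csum(...)` line is the kernel's checksum state for the packet. Here is a minimal sketch that decodes it, assuming the numeric values from the Linux kernel's `include/linux/skbuff.h`. The function name `decode_ip_summed` is made up for illustration.

```python
import re

# Values of skb->ip_summed as defined in the Linux kernel (include/linux/skbuff.h).
IP_SUMMED = {
    0: "CHECKSUM_NONE",
    1: "CHECKSUM_UNNECESSARY",
    2: "CHECKSUM_COMPLETE",
    3: "CHECKSUM_PARTIAL",
}

def decode_ip_summed(line):
    """Return the symbolic name of ip_summed in a 'csum(...)' log line, or None."""
    match = re.search(r"ip_summed=(\d+)", line)
    if match is None:
        return None
    return IP_SUMMED.get(int(match.group(1)), "unknown")

line = "Jun 13 03:02:12 Prime kernel: csum(0xc301ddf4 ip_summed=2 complete_sw=0 valid=0 level=0)"
print(decode_ip_summed(line))
```

Here `ip_summed=2` with `complete_sw=0` means the checksum was supplied by the hardware and did not verify, so the NIC's receive checksum offload or the sky2 driver is a plausible suspect. One experiment, which nobody in this thread has tried yet, would be to turn receive checksum offload off with `ethtool -K eth0 rx off` and see whether the messages stop.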

 

Someone with more knowledge can probably chime in.


The network errors you highlighted do seem to be related somehow. The parity check started on Jun 12 at 23:30. Although the network errors also appear before then, they are far less frequent than after it starts. I'm quite curious why the network error "activity" in the log would increase so much during a parity check.
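To put numbers on the before/after comparison, the sky2 `rx error` lines can be counted per hour. This is a rough sketch, assuming the syslog lines keep the format shown in the excerpts above. The function name `rx_errors_per_hour` and the file name `syslog.txt` are placeholders, not anything taken from the diagnostics.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   "Jun 12 23:30:41 Prime kernel: sky2 0000:06:00.0 eth0: rx error, status ..."
# and captures "Jun 12 23" (month, day, hour) as the bucket key.
RX_ERROR_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}):\d{2}:\d{2} \S+ kernel: sky2 .*rx error"
)

def rx_errors_per_hour(lines):
    """Count sky2 'rx error' lines per (month, day, hour) bucket."""
    counts = Counter()
    for line in lines:
        match = RX_ERROR_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    # "syslog.txt" is a placeholder path; point it at the syslog from the diagnostics zip.
    with open("syslog.txt", encoding="utf-8", errors="replace") as handle:
        for bucket, count in rx_errors_per_hour(handle).items():
            print(bucket, count)
```

The kernel rate-limits these messages (see the `net_ratelimit: 1 callbacks suppressed` line), so the counts are a lower bound. A clear jump in the 23:00 bucket on Jun 12 would still support the link with the parity check.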
