
Parity Check fills log, crashes server



I am running a parity sync and am seeing a lot of errors: 212,786 to be exact, and I'm only 50% of the way through. Setting aside the likely drive issues, I can't access the system log (Tools > System Log) to do any troubleshooting. All I get is this error:

 

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 433421704 bytes) in /usr/local/emhttp/plugins/dynamix/include/Syslog.php on line 20

 

I have increased my log size in my go file to 512MB with this command: mount -o remount,size=512m /var/log

 

Does anyone know why my log file is filling up (81% or 414MB), and why I can't access it? The array is online in Maintenance mode during the Parity-Check.
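For what it's worth, the numbers in the error line up with the log size. Here is a quick arithmetic sketch using only the figures quoted above. That the System Log page tries to read the whole syslog into memory at once is an inference from those figures, not something confirmed from Syslog.php itself.

```python
# Values copied from the PHP error message and the remount command above.
PHP_MEMORY_LIMIT = 134_217_728   # "Allowed memory size", in bytes
ATTEMPTED_ALLOC = 433_421_704    # "tried to allocate", in bytes
LOG_TMPFS_MIB = 512              # size=512m from the remount command

MIB = 1024 * 1024


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


limit_mib = to_mib(PHP_MEMORY_LIMIT)
alloc_mib = to_mib(ATTEMPTED_ALLOC)
usage_pct = 100 * alloc_mib / LOG_TMPFS_MIB

print(f"PHP memory limit:      {limit_mib:.0f} MiB")   # 128 MiB
print(f"Attempted allocation:  {alloc_mib:.1f} MiB")   # 413.3 MiB
print(f"Share of 512 MiB log:  {usage_pct:.0f}%")      # 81%
```

So the attempted allocation is roughly the size of the log itself (about 413 MiB, or 81% of 512 MiB), which is more than three times the 128 MiB PHP limit. If that reading is right, making /var/log bigger only lets the log grow further; it does not help the page load.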

 

After downloading the log file, I found it is full of errors like these:

 

Apr 27 16:16:34 tower kernel: WARNING: CPU: 2 PID: 12696 at drivers/iommu/intel-iommu.c:2300 __domain_mapping+0x205/0x2dd
Apr 27 16:16:34 tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod i915 i2c_algo_bit iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops it87 hwmon_vid bonding x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul crc32c_intel btusb ghash_clmulni_intel pcbc btrtl btbcm btintel bluetooth aesni_intel aes_x86_64 crypto_simd cryptd glue_helper i2c_i801 i2c_core mxm_wmi e1000e intel_cstate intel_uncore intel_rapl_perf ecdh_generic ahci libahci video wmi pcc_cpufreq backlight thermal button fan [last unloaded: tun]
Apr 27 16:16:34 tower kernel: CPU: 2 PID: 12696 Comm: unraidd0 Tainted: G        W         4.19.107-Unraid #1
Apr 27 16:16:34 tower kernel: Hardware name: Gigabyte Technology Co., Ltd. Z87X-UD3H/Z87X-UD3H-CF, BIOS F9 03/18/2014
Apr 27 16:16:34 tower kernel: RIP: 0010:__domain_mapping+0x205/0x2dd
Apr 27 16:16:34 tower kernel: Code: 48 c7 c7 b7 b6 d7 81 e8 1f a2 c7 ff 8b 05 8b 5c a5 00 85 c0 74 08 ff c8 89 05 7f 5c a5 00 48 c7 c7 79 fc d2 81 e8 01 a2 c7 ff <0f> 0b 8b 54 24 24 b8 34 00 00 00 8d 0c d2 83 e9 09 83 f9 34 0f 4f
Apr 27 16:16:34 tower kernel: RSP: 0018:ffffc90003e379c8 EFLAGS: 00010046
Apr 27 16:16:34 tower kernel: RAX: 0000000000000024 RBX: 0000000818b51002 RCX: 0000000000000007
Apr 27 16:16:34 tower kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88881f1164f0
Apr 27 16:16:34 tower kernel: RBP: 0000000000000001 R08: 0000000000000003 R09: 000000000001ae00
Apr 27 16:16:34 tower kernel: R10: 0000000000000000 R11: 0000000000000044 R12: 00000000000dd51f
Apr 27 16:16:34 tower kernel: R13: ffff8888183aeab0 R14: ffff888817cc0000 R15: ffff8887b9a548f8
Apr 27 16:16:34 tower kernel: FS:  0000000000000000(0000) GS:ffff88881f100000(0000) knlGS:0000000000000000
Apr 27 16:16:34 tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 27 16:16:34 tower kernel: CR2: 000014fc1e37c0a0 CR3: 0000000001e0a006 CR4: 00000000001606e0
Apr 27 16:16:34 tower kernel: Call Trace:
Apr 27 16:16:34 tower kernel: domain_mapping+0x16/0xa7
Apr 27 16:16:34 tower kernel: intel_map_sg+0x144/0x189
Apr 27 16:16:34 tower kernel: ata_qc_issue+0x10a/0x195
Apr 27 16:16:34 tower kernel: ? ata_scsi_write_same_xlat+0x2f7/0x2f7
Apr 27 16:16:34 tower kernel: ata_scsi_translate+0xdd/0x14d
Apr 27 16:16:34 tower kernel: ata_scsi_queuecmd+0x254/0x2a8
Apr 27 16:16:34 tower kernel: scsi_dispatch_cmd+0xa2/0xca
Apr 27 16:16:34 tower kernel: scsi_queue_rq+0x395/0x447
Apr 27 16:16:34 tower kernel: blk_mq_dispatch_rq_list+0x2b9/0x491
Apr 27 16:16:34 tower kernel: blk_mq_do_dispatch_sched+0xd0/0xf6
Apr 27 16:16:34 tower kernel: blk_mq_sched_dispatch_requests+0xf7/0x14b
Apr 27 16:16:34 tower kernel: __blk_mq_run_hw_queue+0xaf/0xd6
Apr 27 16:16:34 tower kernel: __blk_mq_delay_run_hw_queue+0x41/0x11f
Apr 27 16:16:34 tower kernel: blk_mq_run_hw_queue+0xb4/0xd4
Apr 27 16:16:34 tower kernel: blk_mq_flush_plug_list+0xc0/0x111
Apr 27 16:16:34 tower kernel: blk_flush_plug_list+0xd7/0x1e8
Apr 27 16:16:34 tower kernel: blk_finish_plug+0x1a/0x27
Apr 27 16:16:34 tower kernel: unraidd+0x130c/0x136e [md_mod]
Apr 27 16:16:34 tower kernel: ? __switch_to_asm+0x35/0x70
Apr 27 16:16:34 tower kernel: ? __schedule+0x4f7/0x548
Apr 27 16:16:34 tower kernel: ? md_thread+0xee/0x115 [md_mod]
Apr 27 16:16:34 tower kernel: ? rmw5_write_data+0x172/0x172 [md_mod]
Apr 27 16:16:34 tower kernel: md_thread+0xee/0x115 [md_mod]
Apr 27 16:16:34 tower kernel: ? wait_woken+0x6a/0x6a
Apr 27 16:16:34 tower kernel: ? md_open+0x2c/0x2c [md_mod]
Apr 27 16:16:34 tower kernel: kthread+0x10c/0x114
Apr 27 16:16:34 tower kernel: ? kthread_park+0x89/0x89
Apr 27 16:16:34 tower kernel: ret_from_fork+0x35/0x40
Apr 27 16:16:34 tower kernel: ---[ end trace 2bce6eca155e7542 ]---
Apr 27 16:16:34 tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xdd51f already set (to 100000 not 7e598d002)
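
Since the web page can't open a log this size, one option is to summarize it from the command line instead. Below is a minimal sketch that streams the file line by line (so it never holds the whole thing in memory) and counts how often each kind of message repeats. The function names, the regular expressions, and the /var/log/syslog path mentioned afterwards are assumptions for illustration; the line layout is taken from the excerpt above.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed layout, based on the excerpt: "Apr 27 16:16:34 tower kernel: <message>"
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+(\S+?):\s*(.*)$")


def signature(message: str) -> str:
    """Mask addresses and numbers so repeats of one message group together."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    # Any standalone hex-looking token that contains at least one digit.
    return re.sub(r"\b[0-9a-fA-F]*\d[0-9a-fA-F]*\b", "<n>", message)


def summarize(lines: Iterable[str], top: int = 10) -> List[Tuple[Tuple[str, str], int]]:
    """Return the most frequent (source, message signature) pairs."""
    counts: Counter = Counter()
    for line in lines:  # a file object yields one line at a time
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            counts[(match.group(1), signature(match.group(2)))] += 1
    return counts.most_common(top)


def summarize_file(path: str, top: int = 10):
    """Stream a log file from disk without loading it all at once."""
    with open(path, "r", errors="replace") as handle:
        return summarize(handle, top)


# Demo on two lines from the excerpt above (the DMAR line is repeated on purpose).
dmar = ("Apr 27 16:16:34 tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xdd51f "
        "already set (to 100000 not 7e598d002)")
sample = [dmar, dmar, "Apr 27 16:16:34 tower kernel: Call Trace:"]

for (source, text), count in summarize(sample):
    print(f"{count:>8}  {source}: {text}")
```

On the server itself, something like summarize_file("/var/log/syslog") would be the equivalent call, assuming that is where the syslog lives. The top entries should show at a glance whether the DMAR/IOMMU warnings are what is filling the log.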

Annotation 2020-04-27 162135.png

Edited by BinaryPatrick
Link to comment

There is no point in continuing with that. Zero errors is the only acceptable result, so you need to start over with it anyway.

 

Stop, shut down, and check all connections: power and SATA, at both ends, including any power splitters.

 

Then start again and if you get any errors at all, post Diagnostics instead of syslog. 

 

Go to Tools > Diagnostics and attach the complete Diagnostics zip file to your NEXT post.

Link to comment
2 hours ago, BinaryPatrick said:

running a parity sync

Parity sync would mean you were building parity. Your syslog indicates you are doing a correcting parity check, not a sync. The fact that it is correcting so many parity errors suggests you didn't have valid parity for some reason.

 

Can you tell us more about how you got to this place? Did you do a New Config for some reason and tell it parity was already valid?

 

Link to comment

Yes. I saw some posts in the Reddit group mentioning that bad RAM can cause mass parity issues, so I ran Memtest86 yesterday. It didn't find any errors. I also ran a short SMART test on each drive, and those returned no errors either.

 

The parity check just finished. I rebooted and am going to run the check again. I feel like I have a failing disk, but I'll see if there are still lots of errors. Not sure what else to check at this point.

Edited by BinaryPatrick
Link to comment
