Entxawp Posted October 23, 2020
Hi everyone, I have an issue with my Unraid server: it kept crashing under a heavy Plex load. That turned out to be because the UPS "runtime left" setting was too high. After fixing that, however, the server is still spitting out BTRFS errors, and Plex even seems to have killed a HT (#18), which is now stuck at 100%. How should I proceed? (Recreate docker and btrfs?) Many thanks, Ent
entxvault-diagnostics-20201023-1417.zip
Entxawp Posted October 23, 2020
Oct 23 15:31:15 entxvault kernel: BTRFS warning (device loop2): csum failed root 1011 ino 45837 off 360448 csum 0x79fcb17b expected csum 0x528975a8 mirror 1
### [PREVIOUS LINE REPEATED 10 TIMES] ###
Oct 23 15:31:34 entxvault kernel: general protection fault: 0000 [#1] SMP NOPTI
Oct 23 15:31:34 entxvault kernel: CPU: 16 PID: 1224 Comm: kswapd0 Tainted: G O 4.19.107-Unraid #1
Oct 23 15:31:34 entxvault kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X399 Taichi, BIOS P3.90 12/04/2019
Oct 23 15:31:34 entxvault kernel: RIP: 0010:dentry_unlink_inode+0xa4/0xc6
Oct 23 15:31:34 entxvault kernel: Code: e7 45 31 c9 45 31 c0 b9 02 00 00 00 4c 89 e2 be 00 04 00 00 e8 df 20 02 00 4c 89 e7 e8 cb 20 02 00 48 8b 45 60 48 85 c0 74 17 <48> 8b 40 40 48 85 c0 74 0e 4c 89 e6 48 89 ef 5d 41 5c e9 1b 0b 8a
Oct 23 15:31:34 entxvault kernel: RSP: 0018:ffffc900042cbc18 EFLAGS: 00010282
Oct 23 15:31:34 entxvault kernel: RAX: fffbffff81c1fa40 RBX: ffff88822edccfc0 RCX: 0000000000000000
Oct 23 15:31:34 entxvault kernel: RDX: ffff88822ed6b730 RSI: ffffc900012416e0 RDI: ffff88822ed6b680
Oct 23 15:31:34 entxvault kernel: RBP: ffff8882d5abb5c0 R08: 0000000000000001 R09: ffffc900042cbb78
Oct 23 15:31:34 entxvault kernel: R10: 0000000000000001 R11: ffff88885cf5fb40 R12: ffff88822ed6b600
Oct 23 15:31:34 entxvault kernel: R13: ffff8882d5abb5c0 R14: ffffc900042cbc98 R15: ffff88822edccfc0
Oct 23 15:31:34 entxvault kernel: FS: 0000000000000000(0000) GS:ffff88885d200000(0000) knlGS:0000000000000000
Oct 23 15:31:34 entxvault kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 23 15:31:34 entxvault kernel: CR2: 000014fa7c15c000 CR3: 00000005b7098000 CR4: 00000000003406e0
Oct 23 15:31:34 entxvault kernel: Call Trace:
Oct 23 15:31:34 entxvault kernel: __dentry_kill+0xcb/0x135
Oct 23 15:31:34 entxvault kernel: shrink_dentry_list+0x149/0x185
Oct 23 15:31:34 entxvault kernel: prune_dcache_sb+0x56/0x74
Oct 23 15:31:34 entxvault kernel: super_cache_scan+0xee/0x16d
Oct 23 15:31:34 entxvault kernel: do_shrink_slab+0x128/0x194
Oct 23 15:31:34 entxvault kernel: shrink_slab+0x11b/0x276
Oct 23 15:31:34 entxvault kernel: shrink_node+0x108/0x3cb
Oct 23 15:31:34 entxvault kernel: kswapd+0x451/0x58a
Oct 23 15:31:34 entxvault kernel: ? __switch_to_asm+0x41/0x70
Oct 23 15:31:34 entxvault kernel: ? mem_cgroup_shrink_node+0xa4/0xa4
Oct 23 15:31:34 entxvault kernel: kthread+0x10c/0x114
Oct 23 15:31:34 entxvault kernel: ? kthread_park+0x89/0x89
Oct 23 15:31:34 entxvault kernel: ret_from_fork+0x22/0x40
Oct 23 15:31:34 entxvault kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net vhost tap xt_nat veth ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod tun nct6775 hwmon_vid bonding igb(O) edac_mce_amd kvm_amd kvm btusb btrtl btbcm btintel bluetooth mpt3sas k10temp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc wmi_bmof mxm_wmi aesni_intel ecdh_generic aes_x86_64 ccp ahci i2c_piix4 crypto_simd i2c_core nvme raid_class pcc_cpufreq cryptd libahci glue_helper scsi_transport_sas nvme_core wmi button acpi_cpufreq [last unloaded: igb]
Oct 23 15:31:34 entxvault kernel: ---[ end trace 4db28ab27bc080eb ]---
Oct 23 15:31:34 entxvault kernel: RIP: 0010:dentry_unlink_inode+0xa4/0xc6
Oct 23 15:31:34 entxvault kernel: Code: e7 45 31 c9 45 31 c0 b9 02 00 00 00 4c 89 e2 be 00 04 00 00 e8 df 20 02 00 4c 89 e7 e8 cb 20 02 00 48 8b 45 60 48 85 c0 74 17 <48> 8b 40 40 48 85 c0 74 0e 4c 89 e6 48 89 ef 5d 41 5c e9 1b 0b 8a
Oct 23 15:31:34 entxvault kernel: RSP: 0018:ffffc900042cbc18 EFLAGS: 00010282
Oct 23 15:31:34 entxvault kernel: RAX: fffbffff81c1fa40 RBX: ffff88822edccfc0 RCX: 0000000000000000
Oct 23 15:31:34 entxvault kernel: RDX: ffff88822ed6b730 RSI: ffffc900012416e0 RDI: ffff88822ed6b680
Oct 23 15:31:34 entxvault kernel: RBP: ffff8882d5abb5c0 R08: 0000000000000001 R09: ffffc900042cbb78
Oct 23 15:31:34 entxvault kernel: R10: 0000000000000001 R11: ffff88885cf5fb40 R12: ffff88822ed6b600
Oct 23 15:31:34 entxvault kernel: R13: ffff8882d5abb5c0 R14: ffffc900042cbc98 R15: ffff88822edccfc0
Oct 23 15:31:34 entxvault kernel: FS: 0000000000000000(0000) GS:ffff88885d200000(0000) knlGS:0000000000000000
Oct 23 15:31:34 entxvault kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 23 15:31:34 entxvault kernel: CR2: 000014fa7c15c000 CR3: 00000005b7098000 CR4: 00000000003406e0
Oct 23 15:31:44 entxvault kernel: BTRFS warning (device loop2): csum failed root 1011 ino 49129 off 4558848 csum 0x946039f0 expected csum 0xbf15fd23 mirror 1
### [PREVIOUS LINE REPEATED 8 TIMES] ###
Oct 23 15:31:54 entxvault kernel: BTRFS warning (device loop2): csum failed root 1011 ino 45837 off 360448 csum 0x79fcb17b expected csum 0x528975a8 mirror 1
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Oct 23 15:32:00 entxvault kernel: BTRFS warning (device loop2): csum failed root 1011 ino 49129 off 4558848 csum 0x946039f0 expected csum 0xbf15fd23 mirror 1
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Oct 23 15:32:04 entxvault kernel: ------------[ cut here ]------------
Oct 23 15:32:04 entxvault kernel: WARNING: CPU: 18 PID: 17059 at fs/btrfs/inode.c:9333 btrfs_destroy_inode+0xaa/0x206
Oct 23 15:32:04 entxvault kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net vhost tap xt_nat veth ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod tun nct6775 hwmon_vid bonding igb(O) edac_mce_amd kvm_amd kvm btusb btrtl btbcm btintel bluetooth mpt3sas k10temp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc wmi_bmof mxm_wmi aesni_intel ecdh_generic aes_x86_64 ccp ahci i2c_piix4 crypto_simd i2c_core nvme raid_class pcc_cpufreq cryptd libahci glue_helper scsi_transport_sas nvme_core wmi button acpi_cpufreq [last unloaded: igb]
Oct 23 15:32:04 entxvault kernel: CPU: 18 PID: 17059 Comm: shfs Tainted: G D O 4.19.107-Unraid #1
Oct 23 15:32:04 entxvault kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X399 Taichi, BIOS P3.90 12/04/2019
Oct 23 15:32:04 entxvault kernel: RIP: 0010:btrfs_destroy_inode+0xaa/0x206
Oct 23 15:32:04 entxvault kernel: Code: ff ff ff 00 74 0e 48 c7 c7 79 fc d2 81 e8 bf 35 e6 ff 0f 0b 48 83 bb 20 ff ff ff 00 74 0e 48 c7 c7 79 fc d2 81 e8 a7 35 e6 ff <0f> 0b 48 83 bb 28 ff ff ff 00 74 0e 48 c7 c7 79 fc d2 81 e8 8f 35
Oct 23 15:32:04 entxvault kernel: RSP: 0018:ffffc90006323998 EFLAGS: 00010246
Oct 23 15:32:04 entxvault kernel: RAX: 0000000000000024 RBX: ffff8882e1ba3700 RCX: 0000000000000000
Oct 23 15:32:04 entxvault kernel: RDX: 0000000000000000 RSI: ffff88885d2964f8 RDI: ffff88885d2964f8
Oct 23 15:32:04 entxvault kernel: RBP: ffff888854eb2800 R08: 000000000000000f R09: ffff8880000b9900
Oct 23 15:32:04 entxvault kernel: R10: 0000000000000000 R11: 0000000000000044 R12: ffff888858bb8000
Oct 23 15:32:04 entxvault kernel: R13: 0000000000000000 R14: 0000000000025eba R15: 0000000000000000
Oct 23 15:32:04 entxvault kernel: FS: 0000148d0653b700(0000) GS:ffff88885d280000(0000) knlGS:0000000000000000
Oct 23 15:32:04 entxvault kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 23 15:32:04 entxvault kernel: CR2: 0000148d0507b020 CR3: 0000000810e50000 CR4: 00000000003406e0
Oct 23 15:32:04 entxvault kernel: Call Trace:
Oct 23 15:32:04 entxvault kernel: dispose_list+0x30/0x39
Oct 23 15:32:04 entxvault kernel: prune_icache_sb+0x56/0x74
Oct 23 15:32:04 entxvault kernel: super_cache_scan+0x11a/0x16d
Oct 23 15:32:04 entxvault kernel: do_shrink_slab+0x128/0x194
Oct 23 15:32:04 entxvault kernel: shrink_slab+0x20c/0x276
Oct 23 15:32:04 entxvault kernel: shrink_node+0x108/0x3cb
Oct 23 15:32:04 entxvault kernel: do_try_to_free_pages+0x1a1/0x300
Oct 23 15:32:04 entxvault kernel: try_to_free_pages+0xb2/0xcd
Oct 23 15:32:04 entxvault kernel: __alloc_pages_nodemask+0x423/0xae1
Oct 23 15:32:04 entxvault kernel: ? __switch_to_asm+0x41/0x70
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Oct 23 15:32:04 entxvault kernel: ? __switch_to_asm+0x35/0x70
Oct 23 15:32:04 entxvault kernel: ? __switch_to+0x2a6/0x2fb
Oct 23 15:32:04 entxvault kernel: fuse_copy_fill.part.0+0x9e/0x147
Oct 23 15:32:04 entxvault kernel: fuse_copy_one+0x43/0x5c
Oct 23 15:32:04 entxvault kernel: fuse_dev_do_read.isra.0+0x4f7/0x650
Oct 23 15:32:04 entxvault kernel: ? do_iter_write+0x14a/0x15c
Oct 23 15:32:04 entxvault kernel: ? wait_woken+0x6a/0x6a
Oct 23 15:32:04 entxvault kernel: fuse_dev_splice_read+0x91/0x14d
Oct 23 15:32:04 entxvault kernel: __se_sys_splice+0x4c2/0x54f
Oct 23 15:32:04 entxvault kernel: do_syscall_64+0x57/0xf2
Oct 23 15:32:04 entxvault kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 23 15:32:04 entxvault kernel: RIP: 0033:0x148d07e78c6b
Oct 23 15:32:04 entxvault kernel: Code: e8 ca 92 f7 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 4c 8b 54 24 18 8b 54 24 28 b8 13 01 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 31 89 ef 48 89 44 24 08 e8 f1 92 f7 ff 48 8b
Oct 23 15:32:04 entxvault kernel: RSP: 002b:0000148d0653add0 EFLAGS: 00000293 ORIG_RAX: 0000000000000113
Oct 23 15:32:04 entxvault kernel: RAX: ffffffffffffffda RBX: 0000000000000036 RCX: 0000148d07e78c6b
Oct 23 15:32:04 entxvault kernel: RDX: 0000000000000036 RSI: 0000000000000000 RDI: 0000000000000004
Oct 23 15:32:04 entxvault kernel: RBP: 0000000000000000 R08: 0000000000021000 R09: 0000000000000000
Oct 23 15:32:04 entxvault kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000148ce4000ba0
Oct 23 15:32:04 entxvault kernel: R13: 0000000000000000 R14: 0000148cd8000b60 R15: 0000148d0653b700
Oct 23 15:32:04 entxvault kernel: ---[ end trace 4db28ab27bc080ec ]---
Oct 23 15:32:13 entxvault kernel: BTRFS warning (device loop2): csum failed root 1011 ino 49129 off 4558848 csum 0x946039f0 expected csum 0xbf15fd23 mirror 1
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Oct 23 15:32:28 entxvault kernel: BTRFS warning (device loop2): csum failed root 1011 ino 45837 off 360448 csum 0x79fcb17b expected csum 0x528975a8 mirror 1
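For anyone reading a log like the one above, it can help to tally the checksum warnings per inode before deciding what to recreate. The following is only a minimal sketch, not an Unraid or btrfs tool: the function name `count_csum_failures` is made up for illustration, and the two regular expressions assume the exact message and repeat-marker formats seen in this log excerpt.

```python
import re
from collections import Counter

# Assumed format: matches the "csum failed" warnings exactly as they appear in this log.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<device>\S+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)
# Assumed format: the marker this log excerpt uses for collapsed duplicate lines.
REPEAT_RE = re.compile(r"### \[PREVIOUS LINE REPEATED (\d+) TIMES\] ###")


def count_csum_failures(lines):
    """Count csum failures per (device, root, inode, offset), honouring repeat markers."""
    counts = Counter()
    last_key = None  # key of the previous line if it was a csum warning
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            last_key = (
                match["device"],
                int(match["root"]),
                int(match["ino"]),
                int(match["off"]),
            )
            counts[last_key] += 1
            continue
        repeat = REPEAT_RE.search(line)
        if repeat and last_key is not None:
            # The marker refers to the csum warning directly before it.
            counts[last_key] += int(repeat.group(1))
        else:
            last_key = None
    return counts


if __name__ == "__main__":
    sample = [
        "Oct 23 15:31:15 entxvault kernel: BTRFS warning (device loop2): csum failed "
        "root 1011 ino 45837 off 360448 csum 0x79fcb17b expected csum 0x528975a8 mirror 1",
        "### [PREVIOUS LINE REPEATED 10 TIMES] ###",
        "Oct 23 15:31:34 entxvault kernel: general protection fault: 0000 [#1] SMP NOPTI",
    ]
    for key, n in count_csum_failures(sample).items():
        print(key, n)
```

In this excerpt every warning is on device loop2 and involves only two inodes (45837 and 49129), each always at the same offset, which is the kind of pattern a tally like this makes easy to spot.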
JorgeB Posted October 23, 2020
There are several crashes, possibly RAM related; see here. You're overclocking the RAM, and that is a known issue with those CPUs. After fixing that, it's also a good idea to recreate the docker image.
Entxawp Posted October 23, 2020
5 minutes ago, JorgeB said: There are several crashes, possibly RAM related; see here. You're overclocking the RAM, and that is a known issue with those CPUs. After fixing that, it's also a good idea to recreate the docker image.
Hi, the RAM is rated for 3200, which is what I'm running it at. I also ran Memtest for 2 passes without any issues, and I ran the server with the exact same setup for months (sometimes with up to 50 days of uptime) without any problems.
Edited October 23, 2020 by Entxawp
JorgeB Posted October 23, 2020
5 minutes ago, Entxawp said: Hi, the ram is 3200 which is what I'm running at
Did you look at the link? It's still way above the maximum speeds AMD supports, so it's an overclock, and it's known to corrupt data in some cases, even if memtest detects no errors.
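To compare what the DIMMs are actually running at against a supported maximum, one option is to read the configured speed from `dmidecode -t memory` output. This is a rough sketch under stated assumptions: the helper name `dimms_over_limit` is hypothetical, the field names "Configured Memory Speed" / "Configured Clock Speed" are what common dmidecode versions print, and the limit is a placeholder you have to look up for your own CPU and DIMM configuration (the thread's link is not reproduced here).

```python
import re

# Assumption: dmidecode prints either "Configured Memory Speed" (newer versions)
# or "Configured Clock Speed" (older versions), in MT/s or MHz.
SPEED_RE = re.compile(
    r"^\s*Configured (?:Memory|Clock) Speed:\s*(\d+)\s*(?:MT/s|MHz)",
    re.MULTILINE,
)


def dimms_over_limit(dmidecode_text, max_supported_mts):
    """Return the configured speeds that exceed the given limit.

    Empty slots report "Unknown" instead of a number and are skipped.
    """
    speeds = [int(s) for s in SPEED_RE.findall(dmidecode_text)]
    return [s for s in speeds if s > max_supported_mts]


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the poster's diagnostics.
    sample = """
Memory Device
\tSpeed: 3200 MT/s
\tConfigured Memory Speed: 3200 MT/s
Memory Device
\tSpeed: Unknown
\tConfigured Memory Speed: Unknown
"""
    # 2667 is only a placeholder limit for the demo.
    print(dimms_over_limit(sample, max_supported_mts=2667))
```

On a real system the text would come from running `dmidecode -t memory` as root; the sample string above only stands in for that output.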
Entxawp Posted October 23, 2020
2 minutes ago, JorgeB said: Did you look at the link? It's still way above the maximum speeds AMD supports, so it's an overclock, and it's known to corrupt data in some cases, even if memtest detects no errors.
Thank you for the help, then. It is a little sad that I will have to sacrifice this much speed for this issue.
JonathanM Posted October 23, 2020
37 minutes ago, Entxawp said: Thank you for the help, then. It is a little sad that I will have to sacrifice this much speed for this issue.
I doubt that you will see a speed decrease; in fact, I suspect things will work much faster with the memory working in sync with the processor. Pushing speeds past spec causes multiple retries for data that is corrupted but correctable, and crashes when the corruption is uncorrectable. Both should be eliminated when things are running in spec.
Entxawp Posted October 23, 2020
1 minute ago, jonathanm said: I doubt that you will see a speed decrease; in fact, I suspect things will work much faster with the memory working in sync with the processor. Pushing speeds past spec causes multiple retries for data that is corrupted but correctable, and crashes when the corruption is uncorrectable. Both should be eliminated when things are running in spec.
I do have one question left: the link doesn't mention Threadrippers with 8 slots running in quad channel (not that I have that right now; I have 4x8GB). If I ever fill all eight slots, can I still run them at 1866?
JorgeB Posted October 23, 2020
4 minutes ago, Entxawp said: the link doesn't mention threadrippers with 8 slots running at quadchannel
It does:
Entxawp Posted October 23, 2020
Just now, JorgeB said: It does:
My apologies, it does. Thank you very much, both of you.
Entxawp Posted October 23, 2020
2 hours ago, JorgeB said: It does:
Hey, I just want to say that I stress-tested the system for about an hour and it no longer crashes. Thank you for your help. I do have one more question: can I still overclock my CPU, or would that cause the memory to be out of sync again? Thank you, Ent
JorgeB Posted October 23, 2020
22 minutes ago, Entxawp said: can I still overclock my cpu,
It's up to you. IMHO servers and overclocking don't go very well together, but you can do it; just remember to go back to stock before doing any future troubleshooting, to rule that out.
Entxawp Posted October 23, 2020
8 minutes ago, JorgeB said: It's up to you. IMHO servers and overclocking don't go very well together, but you can do it; just remember to go back to stock before doing any future troubleshooting, to rule that out.
Alright, thank you.