Multiple hard crashes



You have a Marvell-based disk controller:

03:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller [1b4b:9235] (rev 11)
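For anyone who wants to check their own hardware listing, here is a minimal Python sketch that pulls the vendor and device IDs out of a line like the one above. It assumes the line is in the format printed by `lspci -nn` (class ID and `vendor:device` ID in square brackets). The function name `parse_lspci_line` and the constant `MARVELL_VENDOR_ID` are made up for this example, not part of any tool.

```python
import re

# Assumed layout (as in the line quoted above):
#   <slot> <class name> [<class id>]: <device name> [<vendor>:<device>] (rev NN)
LSPCI_NN = re.compile(
    r"^(?P<slot>\S+)\s+(?P<class_name>.+?)\s+\[(?P<class_id>[0-9a-f]{4})\]:\s+"
    r"(?P<name>.+?)\s+\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)

# Vendor ID taken from the [1b4b:9235] field in the quoted line.
MARVELL_VENDOR_ID = "1b4b"


def parse_lspci_line(line):
    """Return a dict of the bracketed fields, or None if the line does not match."""
    match = LSPCI_NN.match(line.strip())
    return match.groupdict() if match else None


line = (
    "03:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9235 "
    "PCIe 2.0 x2 4-port SATA 6 Gb/s Controller [1b4b:9235] (rev 11)"
)
info = parse_lspci_line(line)
is_marvell = info is not None and info["vendor"] == MARVELL_VENDOR_ID
print(info["slot"], info["vendor"], info["device"], is_marvell)
```

The device ID (`9235` here, as opposed to `9230`) is the field that tells the two chips discussed in this thread apart.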

It's based on the 9235 chip, which seems much less troublesome than the RAID-enabled 9230; the 9230 has caused problems for a number of people recently. I have a 9235 in one of my servers and it has never caused me any problems, but people's experiences vary a great deal. Interestingly, the 9235 isn't included on the original list, while the 9230 is.


Yes, I purchased it several years ago, before reading about the issues it can cause. I have been using that card since 2017 without issue, but I realize the potential for problems now. I plan on replacing it with an LSI Logic SAS 9207-8i. I purchased one off eBay, but it was faulty, so I had to return it. I will purchase another and swap the card out.


The parity check speed is dropping rapidly to nothing. I just noticed this in the log:

 

Nov 15 07:56:25 Tower kernel: nginx[8137]: segfault at 200000000010 ip 00000000004247a3 sp 00007ffd0e5d1f90 error 4 in nginx[420000+101000]
Nov 15 07:56:25 Tower kernel: Code: 1f 84 00 00 00 00 00 48 8b 7b 08 48 85 ff 74 05 e8 62 c6 ff ff 48 8b 1b 48 85 db 75 ea 48 8b 5d 10 eb 0b 0f 1f 40 00 48 89 dd <48> 8b 5b 10 48 89 ef e8 41 c6 ff ff 48 85 db 75 ec 48 83 c4 08 5b
Nov 15 07:56:25 Tower nginx: 2019/11/15 07:56:25 [alert] 8136#8136: worker process 8137 exited on signal 11

 

EDIT: I'm fairly sure this segfault caused the parity check speed to drop to nothing. I stopped and restarted the parity check, and speeds jumped right back up to 150 MB/s.

tower-diagnostics-20191115-1259.zip

Edited by mattekure
Nov 15 09:46:53 Tower kernel: BUG: unable to handle kernel paging request at 0000200000000010
Nov 15 09:46:53 Tower kernel: PGD 0 P4D 0 
Nov 15 09:46:53 Tower kernel: Oops: 0000 [#1] SMP PTI
Nov 15 09:46:53 Tower kernel: CPU: 8 PID: 13121 Comm: sensors Tainted: P           O      4.19.56-Unraid #1
Nov 15 09:46:53 Tower kernel: Hardware name: MSI MS-7885/X99A SLI PLUS(MS-7885), BIOS 1.E0 06/15/2018
Nov 15 09:46:53 Tower kernel: RIP: 0010:__lookup_mnt+0x3e/0x5a
Nov 15 09:46:53 Tower kernel: Code: 48 01 d0 48 89 c2 48 d3 ea 48 01 d0 48 8b 15 a5 c2 d4 00 23 05 b3 c2 d4 00 48 8d 04 c2 48 8b 10 31 c0 48 85 d2 74 1e 48 89 d0 <48> 8b 48 10 48 8d 51 20 48 39 d7 75 06 48 39 70 18 74 08 48 8b 00
Nov 15 09:46:53 Tower kernel: RSP: 0018:ffffc90020ae7c08 EFLAGS: 00010206
Nov 15 09:46:53 Tower kernel: RAX: 0000200000000000 RBX: ffffc90020ae7ca8 RCX: ffffa8889bcae180
Nov 15 09:46:53 Tower kernel: RDX: ffffa8889bcae1a0 RSI: ffff88889f0613c0 RDI: ffff88889bcae1a0
Nov 15 09:46:53 Tower kernel: RBP: ffffc90020ae7d60 R08: 61c8864680b583eb R09: 000000005eef0496
Nov 15 09:46:53 Tower kernel: R10: ffffc90020ae7c4c R11: fffffffffc5cec29 R12: ffffc90020ae7ca0
Nov 15 09:46:53 Tower kernel: R13: ffffc90020ae7c9c R14: 0000000000000001 R15: 0000000000200000
Nov 15 09:46:53 Tower kernel: FS:  00001494da9d3740(0000) GS:ffff88889fa00000(0000) knlGS:0000000000000000
Nov 15 09:46:53 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 15 09:46:53 Tower kernel: CR2: 0000200000000010 CR3: 0000000809ca4001 CR4: 00000000001606e0
Nov 15 09:46:53 Tower kernel: Call Trace:
Nov 15 09:46:53 Tower kernel: __follow_mount_rcu+0x56/0xc0
Nov 15 09:46:53 Tower kernel: lookup_fast+0xfa/0x27a
Nov 15 09:46:53 Tower kernel: walk_component+0xc2/0x249
Nov 15 09:46:53 Tower kernel: ? link_path_walk.part.8+0x1ed/0x42d
Nov 15 09:46:53 Tower kernel: path_lookupat.isra.10+0x12c/0x1e7
Nov 15 09:46:53 Tower kernel: filename_lookup.part.18+0x69/0xcc
Nov 15 09:46:53 Tower kernel: ? _cond_resched+0x1b/0x1e
Nov 15 09:46:53 Tower kernel: ? kmem_cache_alloc+0x30/0xf3
Nov 15 09:46:53 Tower kernel: ? getname_flags+0x44/0x14c
Nov 15 09:46:53 Tower kernel: user_statfs+0x3d/0x93
Nov 15 09:46:53 Tower kernel: __se_sys_statfs+0x20/0x4c
Nov 15 09:46:53 Tower kernel: ? handle_mm_fault+0x158/0x1a7
Nov 15 09:46:53 Tower kernel: ? __do_page_fault+0x379/0x40b
Nov 15 09:46:53 Tower kernel: do_syscall_64+0x57/0xf2
Nov 15 09:46:53 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 15 09:46:53 Tower kernel: RIP: 0033:0x1494dac27027
Nov 15 09:46:53 Tower kernel: Code: 44 00 00 48 8b 05 69 8e 0d 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 8e 0d 00 f7 d8 64 89 01 48
Nov 15 09:46:53 Tower kernel: RSP: 002b:00007ffeed1b9178 EFLAGS: 00000206 ORIG_RAX: 0000000000000089
Nov 15 09:46:53 Tower kernel: RAX: ffffffffffffffda RBX: 00007ffeed1b9498 RCX: 00001494dac27027
Nov 15 09:46:53 Tower kernel: RDX: 00001494dad03000 RSI: 00007ffeed1b9180 RDI: 00001494dad292a0
Nov 15 09:46:53 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Nov 15 09:46:53 Tower kernel: R10: 0000000000000005 R11: 0000000000000206 R12: 0000000000000000
Nov 15 09:46:53 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Nov 15 09:46:53 Tower kernel: Modules linked in: xfs md_mod nct6775 hwmon_vid nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel drm_kms_helper kvm drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc mxm_wmi aesni_intel aes_x86_64 crypto_simd cryptd agpgart e1000e i2c_i801 i2c_core glue_helper intel_cstate syscopyarea sysfillrect sysimgblt fb_sys_fops intel_uncore ahci pcc_cpufreq libahci wmi intel_rapl_perf button
Nov 15 09:46:53 Tower kernel: CR2: 0000200000000010
Nov 15 09:46:53 Tower kernel: ---[ end trace 805b7d055c559bcf ]---
Nov 15 09:46:53 Tower kernel: RIP: 0010:__lookup_mnt+0x3e/0x5a
Nov 15 09:46:53 Tower kernel: Code: 48 01 d0 48 89 c2 48 d3 ea 48 01 d0 48 8b 15 a5 c2 d4 00 23 05 b3 c2 d4 00 48 8d 04 c2 48 8b 10 31 c0 48 85 d2 74 1e 48 89 d0 <48> 8b 48 10 48 8d 51 20 48 39 d7 75 06 48 39 70 18 74 08 48 8b 00
Nov 15 09:46:53 Tower kernel: RSP: 0018:ffffc90020ae7c08 EFLAGS: 00010206
Nov 15 09:46:53 Tower kernel: RAX: 0000200000000000 RBX: ffffc90020ae7ca8 RCX: ffffa8889bcae180
Nov 15 09:46:53 Tower kernel: RDX: ffffa8889bcae1a0 RSI: ffff88889f0613c0 RDI: ffff88889bcae1a0
Nov 15 09:46:53 Tower kernel: RBP: ffffc90020ae7d60 R08: 61c8864680b583eb R09: 000000005eef0496
Nov 15 09:46:53 Tower kernel: R10: ffffc90020ae7c4c R11: fffffffffc5cec29 R12: ffffc90020ae7ca0
Nov 15 09:46:53 Tower kernel: R13: ffffc90020ae7c9c R14: 0000000000000001 R15: 0000000000200000
Nov 15 09:46:53 Tower kernel: FS:  00001494da9d3740(0000) GS:ffff88889fa00000(0000) knlGS:0000000000000000
Nov 15 09:46:53 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 15 09:46:53 Tower kernel: CR2: 0000200000000010 CR3: 0000000809ca4001 CR4: 00000000001606e0
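When comparing crashes like these, it can help to pull the faulting addresses out of the syslog instead of reading them by eye. The Python below is a rough sketch that assumes the two message formats quoted in this thread (a user-space `segfault at ...` line and a `BUG: unable to handle kernel paging request at ...` line). The function name `fault_addresses` is made up for this example, and other kernel versions may word these messages differently.

```python
import re

# Patterns modelled on the two log lines quoted in this thread.
SEGFAULT = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
)
PAGING = re.compile(
    r"BUG: unable to handle kernel paging request at (?P<addr>[0-9a-f]+)"
)


def fault_addresses(lines):
    """Return (kind, process, address) tuples for each fault line found."""
    found = []
    for line in lines:
        match = SEGFAULT.search(line)
        if match:
            found.append(("user segfault", match.group("proc"), int(match.group("addr"), 16)))
            continue
        match = PAGING.search(line)
        if match:
            found.append(("kernel paging request", None, int(match.group("addr"), 16)))
    return found


sample = [
    "Nov 15 07:56:25 Tower kernel: nginx[8137]: segfault at 200000000010 "
    "ip 00000000004247a3 sp 00007ffd0e5d1f90 error 4 in nginx[420000+101000]",
    "Nov 15 09:46:53 Tower kernel: BUG: unable to handle kernel paging request "
    "at 0000200000000010",
]
faults = fault_addresses(sample)
for kind, proc, addr in faults:
    print(f"{kind:24} {proc or '-':8} {addr:#018x}")
```

Run against the two lines quoted in this thread, both entries report the same address, 0x200000000010. To scan a whole file, pass the open syslog file object in place of `sample`.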

 
