
Server crashing


lm699


Hi everybody,

 

My new Unraid server crashes after a few hours: SSH and HTTP stop responding, and I have to force a hard restart (not a clean shutdown) to regain access.

 

Hardware configuration:

+ 1x AMD Ryzen 5 3600 (3.6 GHz / 4.2 GHz)

+ 1x ASRock A520M-ITX/ac

+ 1x Corsair Vengeance LPX Series Low Profile 32 GB (2x 16 GB) DDR4 2666 MHz CL16

+ 1x Fractal Design Node 304 Black

+ 1x BIOS update for AMD Ryzen 5

+ 1x Corsair CX450M 80PLUS Bronze

+ 1x Noctua NH-L9x65 SE-AM4

+ 1x Crucial P3 500 GB

+ 2x Western Digital 10 TB Red Plus

 

Please see the attached files.

 

EDIT 15:34

I have restarted the server in safe mode and am retrying the parity check.

I started watching the syslog from a terminal (SSH), and I am seeing new warnings:

Feb 12 14:53:27 Tower kernel: php-fpm[29564]: segfault at 548cc74 ip 00000000008a38dd sp 00007ffeb4e4e6a0 error 4 in php-fpm[600000+347000]
Feb 12 14:53:27 Tower kernel: Code: 80 f5 41 01 48 89 ea 48 8d 35 2f f2 ff ff e8 1a f0 ff ff e9 d8 fe ff ff 0f 1f 44 00 00 53 48 8b 1f 80 3b 02 74 4f 48 8b 7b 08 <f6> 47 04 40 74 15 48 83 7b 10 00 74 21 f6 43 07 02 74 29 5b c3 66
Feb 12 14:53:27 Tower  php-fpm[1426]: [WARNING] [pool www] child 29564 exited on signal 11 (SIGSEGV) after 11.011545 seconds from start
Feb 12 14:53:38 Tower kernel: php-fpm[29714]: segfault at 548cc74 ip 00000000008a38dd sp 00007ffeb4e4e6a0 error 4 in php-fpm[600000+347000]
Feb 12 14:53:38 Tower kernel: Code: 80 f5 41 01 48 89 ea 48 8d 35 2f f2 ff ff e8 1a f0 ff ff e9 d8 fe ff ff 0f 1f 44 00 00 53 48 8b 1f 80 3b 02 74 4f 48 8b 7b 08 <f6> 47 04 40 74 15 48 83 7b 10 00 74 21 f6 43 07 02 74 29 5b c3 66
Feb 12 14:53:38 Tower  php-fpm[1426]: [WARNING] [pool www] child 29714 exited on signal 11 (SIGSEGV) after 11.011091 seconds from start
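For readers following along, here is a minimal Python sketch (not part of the original post) that decodes kernel segfault lines like the ones above. It assumes the standard x86 page-fault error-code bits and the usual `segfault at ... ip ... sp ... error ...` line layout; the names `parse_segfault` and `decode_pf_error` are made up for illustration. Unraid does not necessarily ship Python, so it is meant to be run on another machine against a copied syslog.

```python
import re

# (bit mask, meaning when set, meaning when clear) for the x86 page-fault error code
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
]

# Layout of a kernel segfault line; the trailing "in obj[base+size]" part is optional.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_pf_error(code):
    """Turn the numeric page-fault error code into readable flags."""
    return [on if code & bit else off for bit, on, off in PF_BITS]


def parse_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "proc": m["proc"],
        "pid": int(m["pid"]),
        "addr": int(m["addr"], 16),   # faulting address
        "ip": int(m["ip"], 16),       # instruction pointer at the fault
        "error": decode_pf_error(int(m["err"], 16)),
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped object
        info["offset_in_object"] = info["ip"] - int(m["base"], 16)
    return info


LINE = ("Feb 12 14:53:27 Tower kernel: php-fpm[29564]: segfault at 548cc74 "
        "ip 00000000008a38dd sp 00007ffeb4e4e6a0 error 4 in php-fpm[600000+347000]")
result = parse_segfault(LINE)
print(result["proc"], result["error"], hex(result["offset_in_object"]))
```

For the lines above, `error 4` decodes to a user-mode read of a page that is not present. Both crashes share the same fault address and instruction pointer, so the two php-fpm children died at the same spot.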

 

Now the server responds very slowly (both HTTP and SSH).

 

 

More traces:

Feb 12 15:53:16 Tower kernel: RIP: 0033:0x8a38dd
Feb 12 15:53:16 Tower kernel: Code: Unable to access opcode bytes at RIP 0x8a38b3.
Feb 12 15:53:16 Tower kernel: RSP: 002b:00007ffeb4e4e4f0 EFLAGS: 00010297
Feb 12 15:53:16 Tower kernel: RAX: 00000000008a38d0 RBX: 000000000157fcf0 RCX: 000000000157ee9f
Feb 12 15:53:16 Tower kernel: RDX: 000000000157fbe0 RSI: 0000000000000007 RDI: 000000000548cc70
Feb 12 15:53:16 Tower kernel: RBP: 000000000157e5c0 R08: 0000000000000007 R09: 000000000157fc70
Feb 12 15:53:16 Tower kernel: R10: 991aebed501b255b R11: 000014d3b43e5c50 R12: 000000000157f5a0
Feb 12 15:53:16 Tower kernel: R13: 000000000157e5f8 R14: 00007ffeb4e4e60c R15: 0000000000004f9a
Feb 12 15:53:16 Tower kernel: </TASK>
Feb 12 15:53:17 Tower kernel: BUG: Bad rss-counter state mm:000000004dde862c type:MM_SWAPENTS val:-2

 

Thanks in advance for your help.

syslog.rtf tower-diagnostics-20230212-1236.zip

Edited by lm699
42 minutes ago, trurl said:

Please don't put syslog in word processor, it makes it more difficult to read and work with.

OK, sorry, that came from TextEdit (the default editor on macOS).

 

I'll try setting Power Supply Idle Control to "Typical Current Idle" and disabling C-states globally.

 

Thanks for your help :)


New bug: the WebUI is unusable. Please see the logs below:

 

Feb 12 22:52:01 Tower kernel: BUG: unable to handle page fault for address: ffffffff89e1fbb0
Feb 12 22:52:01 Tower kernel: #PF: supervisor read access in kernel mode
Feb 12 22:52:01 Tower kernel: #PF: error_code(0x0000) - not-present page
Feb 12 22:52:01 Tower kernel: PGD 520e067 P4D 520e067 PUD 520f063 PMD 0 
Feb 12 22:52:01 Tower kernel: Oops: 0000 [#27] PREEMPT SMP NOPTI
Feb 12 22:52:01 Tower kernel: CPU: 11 PID: 16388 Comm: php-fpm Tainted: G    B D           5.19.17-Unraid #2
Feb 12 22:52:01 Tower kernel: Hardware name: To Be Filled By O.E.M. A520M-ITX/ac/A520M-ITX/ac, BIOS P2.20 12/27/2022
Feb 12 22:52:01 Tower kernel: RIP: 0010:vfs_getattr_nosec+0x68/0x97
Feb 12 22:52:01 Tower kernel: Code: 20 05 df 07 00 00 89 06 41 f6 42 0d 08 74 08 48 c7 46 10 00 10 00 00 48 c7 46 18 00 10 20 00 49 8b 01 48 8b 78 18 49 8b 42 20 <48> 8b 40 70 48 85 c0 74 14 89 d1 41 81 e0 00 60 00 00 48 89 f2 4c
Feb 12 22:52:01 Tower kernel: RSP: 0018:ffffc90005197e08 EFLAGS: 00010246
Feb 12 22:52:01 Tower kernel: RAX: ffffffff89e1fb40 RBX: ffffc90005197e98 RCX: 0000000000000000
Feb 12 22:52:01 Tower kernel: RDX: 00000000000007ff RSI: ffffc90005197e98 RDI: ffffffff8223dbc0
Feb 12 22:52:01 Tower kernel: RBP: 0000000000004001 R08: 0000000000001800 R09: ffffc90005197e18
Feb 12 22:52:01 Tower kernel: R10: ffff888152c160d8 R11: ffffc90005197e9c R12: 0000000000001800
Feb 12 22:52:01 Tower kernel: R13: 0000000000000005 R14: ffff8881022a3000 R15: 00000000000007ff
Feb 12 22:52:01 Tower kernel: FS:  0000153d016b0380(0000) GS:ffff88881eac0000(0000) knlGS:0000000000000000
Feb 12 22:52:01 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 12 22:52:01 Tower kernel: CR2: ffffffff89e1fbb0 CR3: 000000017ec40000 CR4: 0000000000350ee0
Feb 12 22:52:01 Tower kernel: Call Trace:
Feb 12 22:52:01 Tower kernel: <TASK>
Feb 12 22:52:01 Tower kernel: vfs_statx+0x79/0xf9
Feb 12 22:52:01 Tower kernel: vfs_fstatat+0x46/0x62
Feb 12 22:52:01 Tower kernel: __do_sys_newfstatat+0x26/0x5c
Feb 12 22:52:01 Tower kernel: do_syscall_64+0x6b/0x81
Feb 12 22:52:01 Tower kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Feb 12 22:52:01 Tower kernel: RIP: 0033:0x153d042f8a1a
Feb 12 22:52:01 Tower kernel: Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 0b 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 90 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 07 31 c0 c3 0f 1f 40 00 48 8b 15 b1 33 0e 00 f7
Feb 12 22:52:01 Tower kernel: RSP: 002b:00007ffe59ad4878 EFLAGS: 00000206 ORIG_RAX: 0000000000000106
Feb 12 22:52:01 Tower kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000153d042f8a1a
Feb 12 22:52:01 Tower kernel: RDX: 00007ffe59ad4880 RSI: 0000153d0439bf35 RDI: 0000000000000005
Feb 12 22:52:01 Tower kernel: RBP: 0000000000000005 R08: 00007ffe59ad5600 R09: 0000000000000020
Feb 12 22:52:01 Tower kernel: R10: 0000000000001000 R11: 0000000000000206 R12: 00007ffe59ad5600
Feb 12 22:52:01 Tower kernel: R13: 0000000000000020 R14: 0000153d014fb179 R15: 0000000000000000
Feb 12 22:52:01 Tower kernel: </TASK>
Feb 12 22:52:01 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls btusb btrtl btbcm edac_mce_amd edac_core kvm_amd kvm wmi_bmof crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel btintel crypto_simd cryptd bluetooth rapl nvme joydev i2c_piix4 r8169 ecdh_generic ahci nvme_core k10temp ccp ecc realtek i2c_core libahci wmi tpm_crb tpm_tis tpm_tis_core tpm acpi_cpufreq button unix
Feb 12 22:52:01 Tower kernel: CR2: ffffffff89e1fbb0
Feb 12 22:52:01 Tower kernel: ---[ end trace 0000000000000000 ]---
Feb 12 22:52:01 Tower kernel: RIP: 0010:vfs_getattr_nosec+0x68/0x97
Feb 12 22:52:01 Tower kernel: Code: 20 05 df 07 00 00 89 06 41 f6 42 0d 08 74 08 48 c7 46 10 00 10 00 00 48 c7 46 18 00 10 20 00 49 8b 01 48 8b 78 18 49 8b 42 20 <48> 8b 40 70 48 85 c0 74 14 89 d1 41 81 e0 00 60 00 00 48 89 f2 4c
Feb 12 22:52:01 Tower kernel: RSP: 0018:ffffc90001a3fe08 EFLAGS: 00010246
Feb 12 22:52:01 Tower kernel: RAX: ffffffff89e1fb40 RBX: ffffc90001a3fe98 RCX: 0000000000000000
Feb 12 22:52:01 Tower kernel: RDX: 00000000000007ff RSI: ffffc90001a3fe98 RDI: ffffffff8223dbc0
Feb 12 22:52:01 Tower kernel: RBP: 0000000000004001 R08: 0000000000001800 R09: ffffc90001a3fe18
Feb 12 22:52:01 Tower kernel: R10: ffff888152c160d8 R11: ffffc90001a3fe9c R12: 0000000000001800
Feb 12 22:52:01 Tower kernel: R13: 0000000000000007 R14: ffff888102033000 R15: 00000000000007ff
Feb 12 22:52:01 Tower kernel: FS:  0000153d016b0380(0000) GS:ffff88881eac0000(0000) knlGS:0000000000000000
Feb 12 22:52:01 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 12 22:52:01 Tower kernel: CR2: ffffffff89e1fbb0 CR3: 000000017ec40000 CR4: 0000000000350ee0
Feb 12 22:52:01 Tower nginx: 2023/02/12 22:52:01 [error] 1652#1652: *9138 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.241, server: , request: "GET /Main HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.20"
Feb 12 22:52:01 Tower  php-fpm[16117]: [WARNING] [pool www] child 16388 exited on signal 9 (SIGKILL) after 0.960930 seconds from start
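As a reading aid, here is a small Python sketch (again, not from the original post) that pulls the key fields out of an oops like the one above: the faulting address, the oops counter, the process name, the taint flags, and the kernel function at RIP. The function name `summarize_oops` and the regular expressions are illustrative assumptions; the taint descriptions paraphrase the kernel's tainted-kernels documentation and cover only the three flags seen in this log.

```python
import re

# Meanings of the taint flags that appear in this log (paraphrased)
TAINT_MEANINGS = {
    "G": "no proprietary modules loaded",
    "B": "bad page referenced or unexpected page flags",
    "D": "kernel died recently (earlier OOPS or BUG)",
}

# One pattern per field; the first matching line wins.
PATTERNS = {
    "fault_address": re.compile(r"BUG: unable to handle page fault for address: ([0-9a-f]+)"),
    "oops_number": re.compile(r"Oops: [0-9a-f]+ \[#(\d+)\]"),
    "comm": re.compile(r"Comm: (\S+)"),
    "tainted": re.compile(r"Tainted: ([A-Z ]+)"),
    "rip_symbol": re.compile(r"RIP: 0010:(\w+)\+0x"),
}


def summarize_oops(lines):
    """Collect the first occurrence of each field from a block of syslog lines."""
    summary = {}
    for line in lines:
        for key, pattern in PATTERNS.items():
            if key in summary:
                continue
            match = pattern.search(line)
            if match:
                summary[key] = match.group(1)
    if "oops_number" in summary:
        summary["oops_number"] = int(summary["oops_number"])
    if "tainted" in summary:
        # Flags are separated by padding spaces in the log line
        summary["tainted"] = [TAINT_MEANINGS.get(f, "flag " + f)
                              for f in summary["tainted"].split()]
    return summary


SAMPLE = [
    "Feb 12 22:52:01 Tower kernel: BUG: unable to handle page fault for address: ffffffff89e1fbb0",
    "Feb 12 22:52:01 Tower kernel: Oops: 0000 [#27] PREEMPT SMP NOPTI",
    "Feb 12 22:52:01 Tower kernel: CPU: 11 PID: 16388 Comm: php-fpm Tainted: G    B D           5.19.17-Unraid #2",
    "Feb 12 22:52:01 Tower kernel: RIP: 0010:vfs_getattr_nosec+0x68/0x97",
]
summary = summarize_oops(SAMPLE)
for key, value in summary.items():
    print(key, "=", value)
```

The `[#27]` counter means this is the 27th oops since boot, and the `B` and `D` taint flags show the kernel had already hit bad-page and oops conditions before this trace. That makes the earliest oops in the syslog more informative than this one.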

 

I've tried this:

/etc/rc.d/rc.php-fpm restart
/etc/rc.d/rc.php-fpm reload

 

No effect. I can't attach any diagnostics...
