
Error on boot


Go to solution Solved by JorgeB,


Posted (edited)

Greetings, I've been using Unraid for about 2 months and I've been experiencing crashes every couple of days. I've been reading all over about what could be causing my issues but haven't found anything, and honestly I just restart and continue my life in pain.

Well, that lasted until 1 hour ago: my server crashed and won't boot. I couldn't get logs because the server just died, I can't access the share to check for logs, and all I see when booting is this (check the attached image).

I know it's not ideal, and all I ask is whether this error rings a bell in your experience. I'm tempted to just try a different OS; I don't care about the data and I can't make it a week on Unraid. Maybe it's just something wrong with the hardware, but I don't really have spares to test with, shit.

 

This is my hardware, if it helps: 2 NVMe cache drives and 3 x 10 TB WD Red disks (1 of them parity).

image.png.4a0ca8d7cd4b9773108d1ff7256fe1d3.png

 

IMG_0710.jpg

Edited by Ceps
Link to comment

I couldn't try that this time, because at some point a couple of weeks back I bought a new flash drive and moved my install / license to it. The problems continued after that, and when I tried today to install Unraid on my old flash drive I forgot it was blacklisted 😅.

I spent all the time since I made this post running memtest on each memory module, and all passed. After many restarts I finally got in. Nothing in particular changed: I just kept resetting the BIOS and restarting, trying some XMP profiles, removing modules, and trying different memory that I borrowed. Eventually, after another BIOS reset to defaults and a restart with my actual memory, I got in. I rushed to check the logs and they were gone. I only have today's (attached).

Attached are my Docker containers; I also have a single VM.

image.png.06b33c3aa17ba4f75ff57b8366b3bafc.png

image.thumb.png.e26b76e5ed4cea518ddd1e09f383a16d.png

image.png.ccb9ebff6af421d4bb24618f7e5002f7.png

unraid-syslog-20240518-0537.zip

Link to comment
3 hours ago, Ceps said:

I couldn't try that this time, because at some point a couple of weeks back I bought a new flash drive and moved my install / license to it. The problems continued after that, and when I tried today to install Unraid on my old flash drive I forgot it was blacklisted

You could still use it to see if it boots.

 

3 hours ago, Ceps said:

After many restarts I finally got in,

So it's working now? The diagnostics look OK to me for now.

Link to comment

OK, it crashed again and I had to restart (with the physical button). When checking the logs I noticed a dropdown I had ignored before, with a different file that contains everything. Earlier I said I thought I had lost the logs 😞.

image.png.0c7d1486e5d61fcfce2292e5ee403688.png

 

I've attached the log file. For this last crash, I guess the server died somewhere around here; 10:22 was me restarting the server.

 

May 18 22:48:45 unRaid webGUI: Successful login user root from 192.168.1.243
May 18 23:06:07 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:13:22 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:19:24 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:19:24 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 04:35:51 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 04:40:01 unRaid root: Fix Common Problems Version 2024.05.04
May 19 06:03:46 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 06:06:20 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 10:22:27 unRaid root: Delaying execution of fix common problems scan for 10 minutes
May 19 10:22:27 unRaid emhttpd: Starting services...
May 19 10:22:27 unRaid emhttpd: shcmd (54): chmod 0777 '/mnt/user/documents'

 

syslog-192.168.1.6.log.txt.zip

Link to comment
Posted (edited)

New crash, can't get in yet. This is the output when booting Unraid.

IMG_0715.thumb.jpg.3a09b34e3893bb0c246f74f17875231c.jpg

 

Update: After some time I was able to boot again.

Logs around the crash:

May 19 10:23:02 unRaid network: reload service: nginx
May 19 10:23:02 unRaid nginx: 2024/05/19 10:23:02 [alert] 6765#6765: *111 open socket #19 left in connection 10
May 19 10:23:02 unRaid nginx: 2024/05/19 10:23:02 [alert] 6765#6765: aborting
May 19 10:23:30 unRaid kernel: x86/split lock detection: #AC: CPU 0/KVM/8776 took a split_lock trap at address: 0x733c4014
May 19 10:23:30 unRaid kernel: x86/split lock detection: #AC: CPU 1/KVM/8777 took a split_lock trap at address: 0x733c4014
May 19 10:23:30 unRaid kernel: x86/split lock detection: #AC: CPU 3/KVM/8779 took a split_lock trap at address: 0x733c4014
May 19 10:23:30 unRaid kernel: x86/split lock detection: #AC: CPU 2/KVM/8778 took a split_lock trap at address: 0x733c4014
May 19 10:23:31 unRaid kernel: x86/split lock detection: #AC: CPU 7/KVM/8783 took a split_lock trap at address: 0x733c4014
May 19 10:23:31 unRaid kernel: x86/split lock detection: #AC: CPU 5/KVM/8781 took a split_lock trap at address: 0x733c4014
May 19 10:23:31 unRaid kernel: x86/split lock detection: #AC: CPU 6/KVM/8782 took a split_lock trap at address: 0x733c4014
May 19 10:23:31 unRaid kernel: x86/split lock detection: #AC: CPU 4/KVM/8780 took a split_lock trap at address: 0x733c4014
May 19 10:32:00 unRaid root: Fix Common Problems Version 2024.05.04
May 19 12:12:49 unRaid kernel: mce_notify_irq: 8 callbacks suppressed
May 19 12:12:49 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 12:18:01 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 14:19:45 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 14:40:42 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 14:53:49 unRaid kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
May 19 14:53:49 unRaid kernel: CPU: 8 PID: 14923 Comm: smartctl_type Tainted: P           O       6.1.79-Unraid #1
May 19 14:53:49 unRaid kernel: Hardware name: ASUS System Product Name/ProArt Z790-CREATOR WIFI, BIOS 2202 04/17/2024
May 19 14:53:49 unRaid kernel: RIP: 0010:cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: Code: c0 48 89 fd 53 89 f3 8b 35 ba dd 34 01 89 c2 48 89 ef e8 96 96 3d 00 39 05 aa dd 34 01 89 c2 76 08 39 c3 75 04 ff c0 eb de 5b <89> d0 5d c3 cc cc cc cc 0f 1f 44 00 00 55 bd 1f 00 00 00 53 48 89
May 19 14:53:49 unRaid kernel: RSP: 0000:ffffc9000f30fcd8 EFLAGS: 00010246
May 19 14:53:49 unRaid kernel: RAX: 000000000000001c RBX: ffff88816c433300 RCX: 0000000000000009
May 19 14:53:49 unRaid kernel: RDX: 000000000000001c RSI: 000000000000001c RDI: ffff88816c433738
May 19 14:53:49 unRaid kernel: RBP: ffff88816c433738 R08: 0000000000000001 R09: 0000000000000059
May 19 14:53:49 unRaid kernel: R10: ffff8881068c5008 R11: ffff8881068c500c R12: ffff88816c433738
May 19 14:53:49 unRaid kernel: R13: 0000000000000008 R14: 0000000000000000 R15: 0000000000000000
May 19 14:53:49 unRaid kernel: FS:  0000148a3496b640(0000) GS:ffff88a03f200000(0000) knlGS:0000000000000000
May 19 14:53:49 unRaid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 19 14:53:49 unRaid kernel: CR2: 0000148a34ba4280 CR3: 0000001661ca4000 CR4: 0000000000752ee0
May 19 14:53:49 unRaid kernel: PKRU: 55555554
May 19 14:53:49 unRaid kernel: Call Trace:
May 19 14:53:49 unRaid kernel: <TASK>
May 19 14:53:49 unRaid kernel: ? __die_body+0x1a/0x5c
May 19 14:53:49 unRaid kernel: ? die+0x30/0x49
May 19 14:53:49 unRaid kernel: ? do_trap+0x7b/0xfe
May 19 14:53:49 unRaid kernel: ? cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: ? cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: ? do_error_trap+0x6e/0x98
May 19 14:53:49 unRaid kernel: ? cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: ? exc_invalid_op+0x4c/0x60
May 19 14:53:49 unRaid kernel: ? cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: ? asm_exc_invalid_op+0x16/0x20
May 19 14:53:49 unRaid kernel: ? cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: flush_tlb_mm_range+0xb0/0x111
May 19 14:53:49 unRaid kernel: ptep_clear_flush+0x3c/0x45
May 19 14:53:49 unRaid kernel: wp_page_copy+0x36d/0x4a3
May 19 14:53:49 unRaid kernel: __handle_mm_fault+0x71c/0xcf9
May 19 14:53:49 unRaid kernel: handle_mm_fault+0x13d/0x20f
May 19 14:53:49 unRaid kernel: do_user_addr_fault+0x2c3/0x48d
May 19 14:53:49 unRaid kernel: exc_page_fault+0xfb/0x11d
May 19 14:53:49 unRaid kernel: asm_exc_page_fault+0x22/0x30
May 19 14:53:49 unRaid kernel: RIP: 0033:0x148a3863f21c
May 19 14:53:49 unRaid kernel: Code: 1f 80 00 00 00 00 48 8b 08 8b 50 08 4c 01 f9 48 83 fa 26 74 0a 48 83 fa 08 0f 85 cb 19 00 00 48 8b 50 10 48 83 c0 18 4c 01 fa <48> 89 11 48 39 d8 72 d4 4d 8b 93 e8 01 00 00 4d 85 d2 0f 84 fc 0a
May 19 14:53:49 unRaid kernel: RSP: 002b:00007ffc7121ad40 EFLAGS: 00010202
May 19 14:53:49 unRaid kernel: RAX: 0000148a34a8a158 RBX: 0000148a34aafeb0 RCX: 0000148a34ba4280
May 19 14:53:49 unRaid kernel: RDX: 0000148a34ab3970 RSI: 0000148a38664ab0 RDI: 0000148a34a87ca8
May 19 14:53:49 unRaid kernel: RBP: 00007ffc7121ae40 R08: 0000148a34ab15c0 R09: 0000148a34ab2238
May 19 14:53:49 unRaid kernel: R10: 0000000000000001 R11: 0000148a34bdc090 R12: 0000000000000000
May 19 14:53:49 unRaid kernel: R13: 00007ffc7121add0 R14: 0000148a34a87000 R15: 0000148a34a87000
May 19 14:53:49 unRaid kernel: </TASK>
May 19 14:53:49 unRaid kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc bonding tls igc atlantic i915 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iosf_mbi kvm drm_buddy i2c_algo_bit ttm drm_display_helper drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 drm sha1_ssse3 aesni_intel btusb btrtl crypto_simd btbcm btintel input_leds cryptd rapl bluetooth intel_cstate mei_hdcp mei_pxp wmi_bmof intel_gtt joydev led_class i2c_i801 ecdh_generic ecc i2c_smbus
May 19 14:53:49 unRaid kernel: agpgart nvme ahci mei_me intel_uncore thunderbolt i2c_core syscopyarea nvme_core mei libahci sysfillrect vmd sysimgblt video thermal fan fb_sys_fops tpm_crb tpm_tis tpm_tis_core wmi tpm intel_pmc_core backlight acpi_pad acpi_tad button unix [last unloaded: igc]
May 19 14:53:49 unRaid kernel: ---[ end trace 0000000000000000 ]---
May 19 14:53:49 unRaid kernel: RIP: 0010:cpumask_any_but+0x2c/0x34
May 19 14:53:49 unRaid kernel: Code: c0 48 89 fd 53 89 f3 8b 35 ba dd 34 01 89 c2 48 89 ef e8 96 96 3d 00 39 05 aa dd 34 01 89 c2 76 08 39 c3 75 04 ff c0 eb de 5b <89> d0 5d c3 cc cc cc cc 0f 1f 44 00 00 55 bd 1f 00 00 00 53 48 89
May 19 14:53:49 unRaid kernel: RSP: 0000:ffffc9000f30fcd8 EFLAGS: 00010246
May 19 14:53:49 unRaid kernel: RAX: 000000000000001c RBX: ffff88816c433300 RCX: 0000000000000009
May 19 14:53:49 unRaid kernel: RDX: 000000000000001c RSI: 000000000000001c RDI: ffff88816c433738
May 19 14:53:49 unRaid kernel: RBP: ffff88816c433738 R08: 0000000000000001 R09: 0000000000000059
May 19 14:53:49 unRaid kernel: R10: ffff8881068c5008 R11: ffff8881068c500c R12: ffff88816c433738
May 19 14:53:49 unRaid kernel: R13: 0000000000000008 R14: 0000000000000000 R15: 0000000000000000
May 19 14:53:49 unRaid kernel: FS:  0000148a3496b640(0000) GS:ffff88a03f200000(0000) knlGS:0000000000000000
May 19 14:53:49 unRaid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 19 14:53:49 unRaid kernel: CR2: 0000148a34ba4280 CR3: 0000001661ca4000 CR4: 0000000000752ee0
May 19 14:53:49 unRaid kernel: PKRU: 55555554
May 19 14:53:49 unRaid kernel: note: smartctl_type[14923] exited with preempt_count 2
May 19 14:53:58 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 14:54:03 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 19:05:35 unRaid root: Delaying execution of fix common problems scan for 10 minutes
May 19 19:05:35 unRaid emhttpd: Starting services...
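
For anyone skimming a long syslog like the one above, the few lines that matter in a kernel trap are the trap type, the process that was running, and the kernel function at RIP. Below is a rough Python sketch that pulls those three out. The function name `summarize_oops` and the regexes are my own assumptions, based only on the line format in this excerpt. It is not a general oops parser and only recognizes "invalid opcode" traps.

```python
import re

# Patterns are assumptions derived from the excerpt in this thread only.
TRAP = re.compile(r"kernel: (invalid opcode): \d+ \[#(\d+)\]")
COMM = re.compile(r"kernel: CPU: (\d+) PID: (\d+) Comm: (\S+)")
# Segment 0010 is the kernel-mode RIP line; the later 0033 line is user space.
RIP = re.compile(r"kernel: RIP: 0010:(\S+)")


def summarize_oops(lines):
    """Return a dict describing the first kernel trap found, or None."""
    summary = None
    for line in lines:
        if summary is None:
            m = TRAP.search(line)
            if m:
                summary = {"trap": m.group(1), "count": int(m.group(2))}
            continue
        if "cpu" not in summary:
            m = COMM.search(line)
            if m:
                summary.update(
                    cpu=int(m.group(1)), pid=int(m.group(2)), comm=m.group(3)
                )
                continue
        if "rip" not in summary:
            m = RIP.search(line)
            if m:
                summary["rip"] = m.group(1)
                break
    return summary


if __name__ == "__main__":
    sample = [
        "May 19 14:53:49 unRaid kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI",
        "May 19 14:53:49 unRaid kernel: CPU: 8 PID: 14923 Comm: smartctl_type Tainted: P           O       6.1.79-Unraid #1",
        "May 19 14:53:49 unRaid kernel: RIP: 0010:cpumask_any_but+0x2c/0x34",
    ]
    print(summarize_oops(sample))
```

In this log the trap hit while `smartctl_type` was running, inside `cpumask_any_but`. That is ordinary memory-management code rather than anything specific to a driver or plugin.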

 

Edited by Ceps
Link to comment
  • Solution
18 hours ago, Ceps said:
May 18 23:06:07 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:13:22 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:19:24 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:19:24 unRaid kernel: mce: [Hardware Error]: Machine check events logged
May 19 04:35:51 unRaid kernel: mce: [Hardware Error]: Machine check events logged

These suggest a hardware problem.
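
To get a feel for how often these events show up in a saved syslog, something like the following Python sketch could tally them per day. The function name `count_mce_per_day` and the regex are assumptions on my part, based only on the line format quoted in this thread. It counts the "Machine check events logged" notices only and does not decode the underlying hardware error.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this thread:
# "May 18 23:06:07 unRaid kernel: mce: [Hardware Error]: Machine check events logged"
MCE_LINE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+\S+\s+"
    r"kernel: mce: \[Hardware Error\]"
)


def count_mce_per_day(lines):
    """Return a Counter mapping 'Mon DD' to the number of MCE notices seen."""
    counts = Counter()
    for line in lines:
        m = MCE_LINE.match(line)
        if m:
            key = f"{m.group('month')} {int(m.group('day')):02d}"
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "May 18 23:06:07 unRaid kernel: mce: [Hardware Error]: Machine check events logged",
        "May 18 23:13:22 unRaid kernel: mce: [Hardware Error]: Machine check events logged",
        "May 19 04:35:51 unRaid kernel: mce: [Hardware Error]: Machine check events logged",
        "May 19 04:40:01 unRaid root: Fix Common Problems Version 2024.05.04",
    ]
    for day, n in count_mce_per_day(sample).items():
        print(day, n)
```

To run it against a real file, pass the open file object (for example `count_mce_per_day(open("syslog.txt"))`, where the filename is a placeholder). A steady stream of these notices across several days, as in the quoted log, points at hardware rather than at a one-off glitch.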

Link to comment
