February 9, 2025
Hello everyone, I am running Unraid 7.0.0, and recently the system has been disconnecting frequently for no apparent reason. Unplugging and re-plugging the network cable does not help; the only way to restore the system is a forced restart using the power button. I could not find anything in the log files that indicates the cause of the disconnections. The most recent incident occurred last night at around 11:00 PM, and the system only recovered after 8:00 AM today (UTC+8 time zone). Can anyone help me diagnose where the problem lies? I have attached the file from "Tools - Diagnostics - Download". Please help me analyze the cause and let me know what I should do next. Thank you in advance!
Attachment: tower-diagnostics-20250209-1039.zip
February 9, 2025, Community Expert
Enable SSH so you can check whether you can still ping the server and connect to it from a terminal when it becomes unresponsive. You may need to enable a syslog server and see what it is logging before a crash...
Quote
Feb 9 08:10:56 Tower kernel: i915 0000:00:02.0: [drm] [ENCODER:121:DDI D/PHY D] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 9 08:10:56 Tower kernel: i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
Feb 9 08:10:56 Tower kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
Feb 9 08:10:56 Tower kernel: ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no)
Feb 9 08:10:56 Tower kernel: input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
Feb 9 08:10:56 Tower kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Feb 9 08:10:56 Tower kernel: ------------[ cut here ]------------
Feb 9 08:10:56 Tower kernel: RPM raw-wakeref not held
Feb 9 08:10:56 Tower kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Feb 9 08:10:56 Tower kernel: WARNING: CPU: 11 PID: 4173 at drivers/gpu/drm/i915/intel_runtime_pm.h:121 assert_rpm_wakelock_held+0x2d/0x58 [i915]
Feb 9 08:10:56 Tower kernel: Modules linked in: kvmgt(+) mdev i915 drm_buddy ttm i2c_algo_bit drm_display_helper drm_kms_helper drm intel_gtt agpgart ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc bonding tls e1000e r8169 realtek dm_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl mei_hdcp mei_pxp intel_cstate wmi_bmof intel_wmi_thunderbolt i2c_i801 nvme i2c_smbus mei_me intel_uncore i2c_core nvme_core mei ahci libahci tpm_crb tpm_tis tpm_tis_core processor_thermal_device_pci_legacy video processor_thermal_device tpm processor_thermal_rfim backlight processor_thermal_mbox processor_thermal_rapl intel_rapl_common intel_soc_dts_iosf int3400_thermal int3403_thermal acpi_thermal_rel int340x_thermal_zone intel_pch_thermal iosf_mbi wmi acpi_pad button acpi_tad [last unloaded: e1000e]
Feb 9 08:10:56 Tower kernel: CPU: 11 PID: 4173 Comm: modprobe Tainted: G U 6.6.68-Unraid #1
Feb 9 08:10:56 Tower kernel: Hardware name: LENOVO 11DXCTO1WW/316F, BIOS M2WKT52A 01/06/2022
Feb 9 08:10:56 Tower kernel: RIP: 0010:assert_rpm_wakelock_held+0x2d/0x58 [i915]
Feb 9 08:10:56 Tower kernel: Code: 40 8a 7f 11 e8 bc ff ff ff 66 85 db 75 1e 80 3d 3a f1 16 00 00 75 15 48 c7 c7 10 6e bb a0 c6 05 2a f1 16 00 01 e8 83 0d 73 e0 <0f> 0b c1 eb 10 75 1e 80 3d 16 f1 16 00 00 75 15 48 c7 c7 2a 6e bb
Feb 9 08:10:56 Tower kernel: RSP: 0018:ffffc900005f7ce8 EFLAGS: 00010286
Feb 9 08:10:56 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
Feb 9 08:10:56 Tower kernel: RDX: 0000000082440630 RSI: ffffffff822451fd RDI: 00000000ffffffff
Feb 9 08:10:56 Tower kernel: RBP: 0000000000140268 R08: 0000000000000000 R09: ffffffff82440630
Feb 9 08:10:56 Tower kernel: R10: 00007fffffffffff R11: 00000000000320c0 R12: ffffc900005f7d98
Feb 9 08:10:56 Tower kernel: R13: 0000000000000000 R14: ffff8881041e8000 R15: 0000000000000000
Feb 9 08:10:56 Tower kernel: FS: 000014a8f5959440(0000) GS:ffff88883e6c0000(0000) knlGS:0000000000000000
Feb 9 08:10:56 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 9 08:10:56 Tower kernel: CR2: 000014a8f5733000 CR3: 0000000167b98002 CR4: 00000000003706e0
Feb 9 08:10:56 Tower kernel: Call Trace:
Feb 9 08:10:56 Tower kernel: <TASK>
Feb 9 08:10:56 Tower kernel: ? __warn+0x99/0x11a
Feb 9 08:10:56 Tower kernel: ? report_bug+0xd9/0x153
Feb 9 08:10:56 Tower kernel: ? assert_rpm_wakelock_held+0x2d/0x58 [i915]
Feb 9 08:10:56 Tower kernel: ? handle_bug+0x53/0x7c
Feb 9 08:10:56 Tower kernel: ? exc_invalid_op+0x13/0x60
Feb 9 08:10:56 Tower kernel: ? asm_exc_invalid_op+0x16/0x20
Feb 9 08:10:56 Tower kernel: ? assert_rpm_wakelock_held+0x2d/0x58 [i915]
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Feb 9 08:10:56 Tower kernel: fwtable_read32+0x21/0xc0 [i915]
Feb 9 08:10:56 Tower kernel: handle_mmio+0x4f/0x67 [i915]
Feb 9 08:10:56 Tower kernel: iterate_generic_mmio+0x5a4f/0x5ac3 [i915]
Feb 9 08:10:56 Tower kernel: intel_gvt_iterate_mmio_table+0x12/0x194e [i915]
Feb 9 08:10:56 Tower kernel: intel_gvt_init_device+0x16c/0x236 [i915]
Feb 9 08:10:56 Tower kernel: ? __pfx_handle_mmio+0x10/0x10 [i915]
Feb 9 08:10:56 Tower kernel: intel_gvt_set_ops+0x5e/0x82 [i915]
Feb 9 08:10:56 Tower kernel: ? __pfx_kvmgt_init+0x10/0x10 [kvmgt]
Feb 9 08:10:56 Tower kernel: kvmgt_init+0x12/0xff0 [kvmgt]
Feb 9 08:10:56 Tower kernel: do_one_initcall+0x80/0x1a4
Feb 9 08:10:56 Tower kernel: ? kmalloc_trace+0x43/0x52
Feb 9 08:10:56 Tower kernel: do_init_module+0x60/0x20d
Feb 9 08:10:56 Tower kernel: __do_sys_init_module+0xbc/0xfd
Feb 9 08:10:56 Tower kernel: do_syscall_64+0x57/0x7b
Feb 9 08:10:56 Tower kernel: entry_SYSCALL_64_after_hwframe+0x78/0xe2
Feb 9 08:10:56 Tower kernel: RIP: 0033:0x14a8f5a7d9aa
Feb 9 08:10:56 Tower kernel: Code: 48 8b 0d 61 a4 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e a4 0d 00 f7 d8 64 89 01 48
Feb 9 08:10:56 Tower kernel: RSP: 002b:00007fffd41b1ca8 EFLAGS: 00000202 ORIG_RAX: 00000000000000af
Feb 9 08:10:56 Tower kernel: RAX: ffffffffffffffda RBX: 000000000042b420 RCX: 000014a8f5a7d9aa
Feb 9 08:10:56 Tower kernel: RDX: 00000000004204b3 RSI: 00000000000a3df0 RDI: 000014a8f5690010
Feb 9 08:10:56 Tower kernel: RBP: 000014a8f5690010 R08: 0000000000000001 R09: 0000000000000000
Feb 9 08:10:56 Tower kernel: R10: 0000000000000071 R11: 0000000000000202 R12: 00000000004204b3
Feb 9 08:10:56 Tower kernel: R13: 0000000000040000 R14: 000000000042b530 R15: 0000000000000000
Feb 9 08:10:56 Tower kernel: </TASK>
Feb 9 08:10:56 Tower kernel: ---[ end trace 0000000000000000 ]---
Feb 9 08:10:56 Tower kernel: ------------[ cut here ]------------
Feb 9 08:10:56 Tower kernel: RPM wakelock ref not held during HW access
Feb 9 08:10:56 Tower kernel: WARNING: CPU: 11 PID: 4173 at drivers/gpu/drm/i915/intel_runtime_pm.h:129 assert_rpm_wakelock_held+0x50/0x58 [i915]
Feb 9 08:10:56 Tower kernel: Modules linked in: kvmgt(+) mdev i915 drm_buddy ttm i2c_algo_bit drm_display_helper drm_kms_helper drm intel_gtt agpgart ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc bonding tls e1000e r8169 realtek dm_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl mei_hdcp mei_pxp intel_cstate wmi_bmof intel_wmi_thunderbolt i2c_i801 nvme i2c_smbus mei_me intel_uncore i2c_core nvme_core mei ahci libahci tpm_crb tpm_tis tpm_tis_core processor_thermal_device_pci_legacy video processor_thermal_device tpm processor_thermal_rfim backlight processor_thermal_mbox processor_thermal_rapl intel_rapl_common intel_soc_dts_iosf int3400_thermal int3403_thermal acpi_thermal_rel int340x_thermal_zone intel_pch_thermal iosf_mbi wmi acpi_pad button acpi_tad [last unloaded: e1000e]
Feb 9 08:10:56 Tower kernel: CPU: 11 PID: 4173 Comm: modprobe Tainted: G U W 6.6.68-Unraid #1
Feb 9 08:10:56 Tower kernel: Hardware name: LENOVO 11DXCTO1WW/316F, BIOS M2WKT52A 01/06/2022
Feb 9 08:10:56 Tower kernel: RIP: 0010:assert_rpm_wakelock_held+0x50/0x58 [i915]
Feb 9 08:10:56 Tower kernel: Code: 00 01 e8 83 0d 73 e0 0f 0b c1 eb 10 75 1e 80 3d 16 f1 16 00 00 75 15 48 c7 c7 2a 6e bb a0 c6 05 06 f1 16 00 01 e8 60 0d 73 e0 <0f> 0b 5b c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90
Feb 9 08:10:56 Tower kernel: RSP: 0018:ffffc900005f7ce8 EFLAGS: 00010286
Feb 9 08:10:56 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
Feb 9 08:10:56 Tower kernel: RDX: 0000000082440630 RSI: ffffffff822451fd RDI: 00000000ffffffff
Feb 9 08:10:56 Tower kernel: RBP: 0000000000140268 R08: 0000000000000000 R09: ffffffff82440630
Feb 9 08:10:56 Tower kernel: R10: 00007fffffffffff R11: 00000000000320c0 R12: ffffc900005f7d98
Feb 9 08:10:56 Tower kernel: R13: 0000000000000000 R14: ffff8881041e8000 R15: 0000000000000000
Feb 9 08:10:56 Tower kernel: FS: 000014a8f5959440(0000) GS:ffff88883e6c0000(0000) knlGS:0000000000000000
Feb 9 08:10:56 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 9 08:10:56 Tower kernel: CR2: 000014a8f5733000 CR3: 0000000167b98002 CR4: 00000000003706e0
Feb 9 08:10:56 Tower kernel: Call Trace:
Feb 9 08:10:56 Tower kernel: <TASK>
Feb 9 08:10:56 Tower kernel: ? __warn+0x99/0x11a
Feb 9 08:10:56 Tower kernel: ? report_bug+0xd9/0x153
Feb 9 08:10:56 Tower kernel: ? assert_rpm_wakelock_held+0x50/0x58 [i915]
Feb 9 08:10:56 Tower kernel: ? handle_bug+0x53/0x7c
Feb 9 08:10:56 Tower kernel: ? exc_invalid_op+0x13/0x60
Feb 9 08:10:56 Tower kernel: ? asm_exc_invalid_op+0x16/0x20
Feb 9 08:10:56 Tower kernel: ? assert_rpm_wakelock_held+0x50/0x58 [i915]
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Feb 9 08:10:56 Tower kernel: fwtable_read32+0x21/0xc0 [i915]
Feb 9 08:10:56 Tower kernel: handle_mmio+0x4f/0x67 [i915]
Feb 9 08:10:56 Tower kernel: iterate_generic_mmio+0x5a4f/0x5ac3 [i915]
Feb 9 08:10:56 Tower kernel: intel_gvt_iterate_mmio_table+0x12/0x194e [i915]
Feb 9 08:10:56 Tower kernel: intel_gvt_init_device+0x16c/0x236 [i915]
Feb 9 08:10:56 Tower kernel: ? __pfx_handle_mmio+0x10/0x10 [i915]
Feb 9 08:10:56 Tower kernel: intel_gvt_set_ops+0x5e/0x82 [i915]
Feb 9 08:10:56 Tower kernel: ? __pfx_kvmgt_init+0x10/0x10 [kvmgt]
Feb 9 08:10:56 Tower kernel: kvmgt_init+0x12/0xff0 [kvmgt]
Feb 9 08:10:56 Tower kernel: do_one_initcall+0x80/0x1a4
Feb 9 08:10:56 Tower kernel: ? kmalloc_trace+0x43/0x52
Feb 9 08:10:56 Tower kernel: do_init_module+0x60/0x20d
Feb 9 08:10:56 Tower kernel: __do_sys_init_module+0xbc/0xfd
Feb 9 08:10:56 Tower kernel: do_syscall_64+0x57/0x7b
Feb 9 08:10:56 Tower kernel: entry_SYSCALL_64_after_hwframe+0x78/0xe2
Feb 9 08:10:56 Tower kernel: RIP: 0033:0x14a8f5a7d9aa
Feb 9 08:10:56 Tower kernel: Code: 48 8b 0d 61 a4 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e a4 0d 00 f7 d8 64 89 01 48
Feb 9 08:10:56 Tower kernel: RSP: 002b:00007fffd41b1ca8 EFLAGS: 00000202 ORIG_RAX: 00000000000000af
Feb 9 08:10:56 Tower kernel: RAX: ffffffffffffffda RBX: 000000000042b420 RCX: 000014a8f5a7d9aa
Feb 9 08:10:56 Tower kernel: RDX: 00000000004204b3 RSI: 00000000000a3df0 RDI: 000014a8f5690010
Feb 9 08:10:56 Tower kernel: RBP: 000014a8f5690010 R08: 0000000000000001 R09: 0000000000000000
Feb 9 08:10:56 Tower kernel: R10: 0000000000000071 R11: 0000000000000202 R12: 00000000004204b3
Feb 9 08:10:56 Tower kernel: R13: 0000000000040000 R14: 000000000042b530 R15: 0000000000000000
Feb 9 08:10:56 Tower kernel: </TASK>
Feb 9 08:10:56 Tower kernel: ---[ end trace 0000000000000000 ]---
Feb 9 08:10:56 Tower kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Feb 9 08:10:56 Tower kernel: i915 0000:00:02.0: Direct firmware load for i915/gvt/vid_0x8086_did_0x9bc8_rid_0x03.golden_hw_state failed with error -2
Feb 9 10:22:53 Tower nginx: 2025/02/09 10:22:53 [alert] 5978#5978: worker process 10021 exited on signal 6
You appear to be having Intel graphics issues and call traces for encoding...
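Once the syslog is being mirrored or sent to a syslog server as suggested above, a small script can help pick out the interesting entries from a long file. The following is only a rough Python sketch: the pattern names and the `scan_syslog` helper are made up for illustration, the patterns are simply the message fragments quoted in this thread, and it expects one log entry per line.

```python
import re
from collections import Counter

# Message fragments taken from the logs quoted in this thread.
# The dictionary keys are arbitrary labels chosen for this sketch.
PATTERNS = {
    "kernel_warning": re.compile(r"kernel: WARNING: "),
    "nic_hang": re.compile(r"Detected Hardware Unit Hang"),
    "firmware_load_failed": re.compile(r"Direct firmware load for \S+ failed"),
    "process_abort": re.compile(r"exited on signal \d+"),
}


def scan_syslog(lines):
    """Count matches per pattern and remember the first matching line."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                first_seen.setdefault(name, line.strip())
    return counts, first_seen


def report(path):
    """Print a short summary for the syslog file at `path`."""
    with open(path, errors="replace") as fh:
        counts, first_seen = scan_syslog(fh)
    for name, n in counts.most_common():
        print(f"{name}: {n}  first: {first_seen[name]}")


# Example call; the path is an assumption, use wherever your syslog copy lives:
# report("/boot/logs/syslog")
```

A high count for one pattern shortly before the log stops is only a hint about where to look next, not proof of the cause.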
February 12, 2025, Author
Thank you for your guidance. As per your suggestion, I enabled "Settings -> System Log Server -> Mirror system log to flash" and set it to Yes. Last night, my Unraid server disconnected again. I checked my router and confirmed that there was no IP address assigned to the Unraid server, and I couldn't ping it. This suggests that the system completely froze. Below is my diagnostics file for analysis.
Additionally, I suspect that the issue might be caused by an IPTV service Docker container (pulled from youshandefeiyang/allinone). Over the past few days, I haven't experienced any issues while this container was disabled. However, after enabling it last night, the system lost connection one hour later, right when I was watching IPTV. I cannot confirm whether this container is the root cause, and I would appreciate any insights or suggestions. Thanks in advance!
Attachment: tower-diagnostics-20250212-0933.zip
February 12, 2025, Community Expert
NIC issues:
Feb 11 20:11:16 Tower kernel: e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
Feb 11 20:11:16 Tower kernel: TDH <c7>
Feb 11 20:11:16 Tower kernel: TDT <db>
Feb 11 20:11:16 Tower kernel: next_to_use <db>
Feb 11 20:11:16 Tower kernel: next_to_clean <c6>
Feb 11 20:11:16 Tower kernel: buffer_info[next_to_clean]:
Feb 11 20:11:16 Tower kernel: time_stamp <10b118b55>
Feb 11 20:11:16 Tower kernel: next_to_watch <c7>
Feb 11 20:11:16 Tower kernel: jiffies <10b119580>
Feb 11 20:11:16 Tower kernel: next_to_watch.status <0>
Feb 11 20:11:16 Tower kernel: MAC Status <40080083>
Feb 11 20:11:16 Tower kernel: PHY Status <796d>
Feb 11 20:11:16 Tower kernel: PHY 1000BASE-T Status <3800>
Feb 11 20:11:16 Tower kernel: PHY Extended Status <3000>
Feb 11 20:11:16 Tower kernel: PCI Status <10>
Feb 11 20:11:18 Tower kernel: e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
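For anyone trying to read the hang dump above: the values in angle brackets are hexadecimal, and TDH/TDT are the transmit descriptor head and tail. The Python sketch below decodes them and derives two rough figures. The function names are invented for this example, the ring size of 256 is an assumption (check your actual setting), and my reading that `time_stamp` marks when the oldest pending buffer was queued is an interpretation of the driver's output, not something this thread confirms. It expects one log entry per line.

```python
import re

# Matches entries such as "... kernel: TDH <c7>" and captures name and hex value.
FIELD = re.compile(r"kernel:\s+(.+?)\s+<([0-9a-fA-F]+)>\s*$")


def parse_hang_dump(lines):
    """Return a dict of field name -> integer value from an e1000e hang dump."""
    fields = {}
    for line in lines:
        match = FIELD.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields, ring_size=256):
    """Derive rough indicators; ring_size is an assumed TX ring length."""
    return {
        # Descriptors handed to the NIC (tail) but not yet processed (head).
        "pending_descriptors": (fields["TDT"] - fields["TDH"]) % ring_size,
        # Jiffies elapsed since the oldest pending buffer was timestamped.
        "stalled_jiffies": fields["jiffies"] - fields["time_stamp"],
    }
```

Fed the dump quoted above, this reports 20 pending descriptors and a stall of 2603 jiffies. Converting jiffies to seconds depends on the kernel's HZ setting, which is not shown in the thread.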
February 12, 2025, Author
Thank you for your guidance. I would like to know how I can resolve this NIC issue. From the logs, can we rule out a poor cable connection? Since I didn't touch the network cable or router when the issue occurred, could it be a driver issue with the network card? Also, does the log indicate that the e1000e network card is problematic? The e1000e network card is a virtual network card, right? So, could it be that an issue with the virtual machine is causing the network problem? Am I understanding this correctly?
February 12, 2025, Community Expert
In my experience, this tends to be more of a kernel/driver issue with those Intel NICs and some boards. A BIOS update might help, as might a newer (or older) Unraid release. Failing that, you can try using an add-on NIC.
March 6, 20251 yr Author Solution I can basically conclude that the issue is with the youshandefeiyang/allinone Docker container. When it is enabled, it randomly becomes unresponsive (I can't determine whether it's a network disconnection or a system crash, but a reboot restores it). I have disabled it for two weeks now, and the issue has not occurred again.
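One way to narrow down whether the machine froze or only lost its network is to look for silent periods in a syslog copy that survives the reboot. The Python sketch below is a rough illustration under several assumptions: the helper names are invented, the log uses the classic "Mon DD HH:MM:SS" prefix with one entry per line, the year has to be supplied because that format omits it, and the 30-minute threshold is arbitrary.

```python
from datetime import datetime, timedelta


def parse_ts(line, year):
    """Parse the 'Mon DD HH:MM:SS' prefix of a classic syslog line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, year, min_gap=timedelta(minutes=30)):
    """Return (last_entry_before, first_entry_after) pairs for each long silence."""
    gaps = []
    prev = None
    for line in lines:
        try:
            ts = parse_ts(line, year)
        except ValueError:
            continue  # Not a timestamped entry (blank line, repeat marker, ...).
        if prev is not None and ts - prev >= min_gap:
            gaps.append((prev, ts))
        prev = ts
    return gaps
```

If the log goes silent at the time of the outage and only resumes with boot messages, that is consistent with a full freeze. If entries keep arriving during the outage (NIC hang messages, for example), the kernel was probably still running and only the network was down. Neither reading is conclusive on its own.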