
unRaid crashed "BUG: kernel NULL pointer dereference, address: 0000000000000000"



I was doing some things in Plex (trying to add a cartoon series) and then all of a sudden the server crashed. I could not log into the GUI, but some of the shares still worked. And I could ssh into it.

I managed to grab the dmesg before I did a reboot. I could not do a "powerdown" but I could do a "shutdown -h now". It's doing a parity check as I type this.
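For anyone landing here with the same symptoms, grabbing the kernel log before a reboot can be scripted roughly like this. It is only a sketch: the function name `save_crash_log` is made up for illustration, and `/boot/logs` is an assumed destination (the flash drive on a stock Unraid install), so adjust it if your setup differs.

```shell
# Hypothetical helper: read a log from stdin and write it to a
# timestamped file in the given directory, then print the file's path.
save_crash_log() {
    dest_dir="$1"
    mkdir -p "$dest_dir" || return 1
    out="$dest_dir/dmesg-$(date +%Y%m%d-%H%M%S).txt"
    cat > "$out" || return 1
    printf '%s\n' "$out"
}

# Example use over ssh, before rebooting (destination is an assumption):
#   dmesg -T | save_crash_log /boot/logs
```

Writing to persistent storage matters here because the in-memory kernel log is gone after the reboot.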

 

Any ideas what happened here?

 

[3547852.627297] BUG: kernel NULL pointer dereference, address: 0000000000000000
[3547852.627882] #PF: supervisor write access in kernel mode
[3547852.628434] #PF: error_code(0x0002) - not-present page
[3547852.628975] PGD 0 P4D 0 
[3547852.629511] Oops: 0002 [#1] PREEMPT SMP PTI
[3547852.630051] CPU: 8 PID: 19396 Comm: shfs Tainted: G        W I       5.19.17-Unraid #2
[3547852.630621] Hardware name: Dell Inc. PowerEdge R710/XXXXXX, BIOS 6.6.0 05/22/2018
[3547852.631155] RIP: 0010:__rb_erase_color+0xe7/0x1ca
[3547852.631732] Code: 8b 6b 10 f6 45 00 01 75 2f 4c 8b 75 08 48 89 d8 48 89 ee 31 c9 48 83 c8 01 4c 89 ea 48 89 df 4c 89 73 10 48 89 5d 08 4c 89 f5 <49> 89 06 e8 d8 fe ff ff 2e e8 9a be 79 00 48 8b 45 10 48 85 c0 74
[3547852.632927] RSP: 0018:ffffc9000b057d18 EFLAGS: 00010286
[3547852.633522] RAX: ffff888cc8e3a621 RBX: ffff888cc8e3a620 RCX: 0000000000000000
[3547852.634153] RDX: ffff888370fb0cc8 RSI: ffff888cc8e3a1a0 RDI: ffff888cc8e3a620
[3547852.634759] RBP: 0000000000000000 R08: ffff888acf65f520 R09: ffffc9000b057cd0
[3547852.635361] R10: 00001466f75f7000 R11: 00001466f75f8000 R12: ffffffff811d547a
[3547852.636008] R13: ffff888370fb0cc8 R14: 0000000000000000 R15: ffff888cc8e3a600
[3547852.636646] FS:  000014670d18e6c0(0000) GS:ffff88902fb00000(0000) knlGS:0000000000000000
[3547852.637263] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3547852.637876] CR2: 0000000000000000 CR3: 0000000855544004 CR4: 00000000000226e0
[3547852.638513] Call Trace:
[3547852.639169]  <TASK>
[3547852.639824]  __do_munmap+0x1c0/0x2e2
[3547852.640466]  mmap_region+0x10d/0x45d
[3547852.641070]  do_mmap+0x3c1/0x42d
[3547852.641664]  vm_mmap_pgoff+0xbb/0x112
[3547852.642255]  ksys_mmap_pgoff+0x138/0x166
[3547852.642835]  do_syscall_64+0x6b/0x81
[3547852.643407]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[3547852.643971] RIP: 0033:0x14670d7a47f3
[3547852.644526] Code: ef e8 61 b8 ff ff eb e7 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 89 ca 41 f7 c1 ff 0f 00 00 75 14 b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 25 c3 0f 1f 40 00 48 8b 05 d9 95 0d 00 64 c7
[3547852.645726] RSP: 002b:000014670d18db88 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[3547852.646339] RAX: ffffffffffffffda RBX: 00005641df847b50 RCX: 000014670d7a47f3
[3547852.646954] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000
[3547852.647569] RBP: 00005641df847a20 R08: 00000000ffffffff R09: 0000000000000000
[3547852.648180] R10: 0000000000000022 R11: 0000000000000246 R12: 00000000000000b8
[3547852.648787] R13: 00005641df847a88 R14: 0000000000000000 R15: 000014670d18dc80
[3547852.649398]  </TASK>
[3547852.650000] Modules linked in: vhost_net vhost tap kvm_intel kvm macvlan md_mod tls xt_mark xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle tun vhost_iotlb veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc ipmi_devintf iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bnx2 mgag200 drm_shmem_helper i2c_algo_bit drm_kms_helper drm sr_mod cdrom ipmi_ssif backlight intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd intel_cstate i2c_core input_leds syscopyarea intel_uncore mpt3sas led_class joydev ahci sysfillrect ata_piix sysimgblt fb_sys_fops
[3547852.650111]  libahci raid_class scsi_transport_sas wmi ipmi_si acpi_power_meter button acpi_cpufreq unix [last unloaded: md_mod]
[3547852.656724] CR2: 0000000000000000
[3547852.658803] ---[ end trace 0000000000000000 ]---
[3547852.674564] RIP: 0010:__rb_erase_color+0xe7/0x1ca
[3547852.675323] Code: 8b 6b 10 f6 45 00 01 75 2f 4c 8b 75 08 48 89 d8 48 89 ee 31 c9 48 83 c8 01 4c 89 ea 48 89 df 4c 89 73 10 48 89 5d 08 4c 89 f5 <49> 89 06 e8 d8 fe ff ff 2e e8 9a be 79 00 48 8b 45 10 48 85 c0 74
[3547852.676900] RSP: 0018:ffffc9000b057d18 EFLAGS: 00010286
[3547852.677696] RAX: ffff888cc8e3a621 RBX: ffff888cc8e3a620 RCX: 0000000000000000
[3547852.678495] RDX: ffff888370fb0cc8 RSI: ffff888cc8e3a1a0 RDI: ffff888cc8e3a620
[3547852.679286] RBP: 0000000000000000 R08: ffff888acf65f520 R09: ffffc9000b057cd0
[3547852.680055] R10: 00001466f75f7000 R11: 00001466f75f8000 R12: ffffffff811d547a
[3547852.680811] R13: ffff888370fb0cc8 R14: 0000000000000000 R15: ffff888cc8e3a600
[3547852.681565] FS:  000014670d18e6c0(0000) GS:ffff88902fb00000(0000) knlGS:0000000000000000
[3547852.682360] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3547852.683168] CR2: 0000000000000000 CR3: 0000000855544004 CR4: 00000000000226e0
[3548148.759343] br0: port 2(vnet1) entered disabled state
[3548148.762004] device vnet1 left promiscuous mode
[3548148.762668] br0: port 2(vnet1) entered disabled state
[3548940.712053] elogind-daemon[1579]: New session c20 of user root.
[3549039.152378] elogind-daemon[1579]: Removed session c20.
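As a reading aid for the `#PF: error_code(0x0002)` line in the trace above, here is a small sketch that decodes the three low bits of an x86 page-fault error code. The function name `decode_pf_error` is hypothetical, and the higher bits (reserved-bit and instruction-fetch faults) are deliberately ignored.

```shell
# Decode the low three bits of an x86 page-fault error code:
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = supervisor,       1 = user
decode_pf_error() {
    code=$(( $1 ))
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="not-present page"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then priv="user"; else priv="supervisor"; fi
    printf '%s, %s %s access\n' "$cause" "$priv" "$access"
}

decode_pf_error 0x0002   # prints: not-present page, supervisor write access
```

That agrees with what the kernel printed: a supervisor write to a page that is not mapped, which is what a write through a NULL pointer (address 0) looks like.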

 

flores-diagnostics-20230408-2107.zip


@trwolff04

Seems like I am having more problems since going to unRaid 6.11.x. No problems before.

In addition to this crash condition, my DelugeVPN sometimes goes down overnight (every day or two). A simple restart brings it back up. I'm trying to look into that as well in a separate post.

Edited by nraygun

Hmmm, a similar sort of thing is happening for two users: bittorrent in use, and a crashing server with a partially inoperative GUI.

Do you know if anyone else is having these seemingly related problems?

 

Also, I'm starting to think it's a kernel bug of some sort. Google shows a handful of reports of this null pointer dereference.

How can we get Limetech folks involved in this? Or do we need to jump on another thread?


Thanks @JorgeB!

My server is not, thankfully, crashing as much as others are reporting.

I guess what we're seeing here is that my delugevpn errors and the server crashes may be related.

 

I'll have to look through this thread for the version of binhex/arch-delugevpn that uses libtorrent 1.x.

Not sure I want to switch out the whole bittorrent VPN container for another one. That'll be my last resort. I'm a relative noob on this stuff.

Edited by nraygun

qbittorrent has always run better for me, and binhex has a version available for that. You need to pull version 4.3.9-2-01, as that is the last one to use libtorrent v1. Mine is running fine again, though I'll only know for sure in a week if it hasn't crashed.

 

Through ssh: docker pull binhex/arch-qbittorrentvpn:4.3.9-2-01

In Unraid Apps, search for and install binhex qbittorrentvpn, then under the container options specify "binhex/arch-qbittorrentvpn:4.3.9-2-01" next to Repository and it should work fine. I know you don't want to switch, but I can confirm mine is operational.
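If it helps, the pinned pull can be sketched as a tiny script so the tag is written down in one place. The variable and function names are just illustrative, and it assumes the `docker` CLI is available in your ssh session.

```shell
# Tag taken from the post above; names here are illustrative only.
REPO="binhex/arch-qbittorrentvpn"
TAG="4.3.9-2-01"
IMAGE="${REPO}:${TAG}"

# Pull the pinned image, then list the tags the local copy carries.
pull_pinned() {
    docker pull "$IMAGE" &&
        docker image inspect --format '{{.RepoTags}}' "$IMAGE"
}

# Run it over ssh on the server:
#   pull_pinned
```

With an explicit tag in the Repository field, the container should stay on that build rather than following `latest`.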

Edited by trwolff04

Today I downgraded to the last version of binhex-qbittorrent (4.3.9-2-01) that used libtorrent v1, specifically because I was having docker crashes, and sometimes full GUI crashes, anywhere between 24 hours and 8 days after starting my server. This version seems to fix it, but I won't know for sure until another week passes.

Edited by trwolff04
56 minutes ago, trwolff04 said:

Today I downgraded to the last version of binhex-qbittorrent (4.3.9-2-01) that used libtorrent v1, specifically because I was having docker crashes, and sometimes full GUI crashes, anywhere between 24 hours and 8 days after starting my server. This version seems to fix it, but I won't know for sure until another week passes.

Yep, same here.

 

Now we wait!

 

(I'll go update the other post to let them know I went to qbittorrent 4.3.9-2-01)


Had a crash - sort of.

 

I removed one of my drives from the array using a method from SpaceInvader One. That went fine.

At some point during this process, I got what appears to be a crash:

CPU: 22 PID: 18564 Comm: kworker/22:0 Tainted: G I 5.19.17-Unraid #2

Workqueue: events macvlan_process_broadcast [macvlan]

 

Everything seems fine and I didn't notice this crash until I looked in the logs. I've seen mention of this type of crash in other threads.
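Since this kind of trace is easy to miss while the server keeps running, a quick filter over the logs can help. This is a rough sketch: the function name `scan_kernel_traces` is invented, the patterns are just the signatures seen in this thread, and `/var/log/syslog` is an assumed log location.

```shell
# Hypothetical filter: print only lines matching the crash signatures
# mentioned in this thread. Reads the log from stdin.
scan_kernel_traces() {
    grep -E 'BUG: kernel NULL pointer dereference|Oops: |Workqueue: events macvlan_process_broadcast'
}

# Example use (log path is an assumption):
#   dmesg | scan_kernel_traces
#   scan_kernel_traces < /var/log/syslog
```

Note that `grep` exits with a non-zero status when nothing matches, so an empty result here is the good outcome.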

It's rebuilding the parity since I removed a drive.

 

Once this is done, I'll reboot one more time and monitor.

 

Edited by nraygun
5 hours ago, JorgeB said:

Try switching to ipvlan (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right)).

I had tried that and it caused my DuckDNS container to stop working. Putting it back to macvlan restored operation.

Maybe once the parity is done building, I'll try it again with a reboot.

12 hours ago, nraygun said:

Parity done, changed to ipvlan, rebooted - all good I think.

There's an entry about a kernel bug that I'll post in a different thread, but I think the macvlan crash should be OK now.

Nope.

Using ipvlan renders my server inoperable. My server can't ping hosts and my DuckDNS no longer updates.

Going back to macvlan restored all operation.

 

While I get what appear to be crashes, my server stays operational.

There's this post on setting up a separate network on a separate NIC that I might look into:

https://forums.unraid.net/topic/137048-guide-how-to-solve-macvlan-and-ipvlan-issues-with-containers-on-a-custom-network/

 

 

But as far as the crashes from libtorrent, I think I'm good using the old qbittorrent container with libtorrent 1.x.

 

 
