kernel panic - B13 (yes, I know...)


RFehr

OK, I'm behind. However, my b13 system was EXTREMELY stable until a few days ago, when I added the latest CouchPotato_v2 plugin and updated Plex Media Server to the latest version (0.9.6.6).

Got a red ball on a known-good disk. Ran SMART tests to confirm. Tried the trust-array procedure, which resulted in the mdcmd invalid argument error already covered in another thread. Started the array anyway (unprotected) and disabled anything that would perform writes, other than mover. I did not realize there was still data on the cache to sync. Nonetheless, this scenario should not have caused a panic.

I'm updating to the latest rc, hoping that the mdcmd trust-array fix is in place.

Aug 12 19:00:01 UnRaid logger: mover started
Aug 12 19:00:01 UnRaid logger: skipping Apps/
Aug 12 19:00:01 UnRaid logger: moving Media/
Aug 12 19:00:01 UnRaid logger: moving Television/
Aug 12 19:00:14 UnRaid logger: ./Television/The Daily Show with Jon Stewart/The Daily Show with Jon Stewart - s17e134 - Jessica Biel.mkv
Aug 12 19:00:14 UnRaid logger: rsync: get_xattr_data: lgetxattr(".","user.org.netatalk.supports-eas.aK2YbM",0) failed: Exec format error (
Aug 12 19:00:14 UnRaid logger: rsync: get_xattr_data: lgetxattr("Television","user.org.netatalk.supports-eas.PvAD6m",0) failed: Exec format error (
Aug 12 19:00:14 UnRaid logger: >f......... Television/The Daily Show with Jon Stewart/The Daily Show with Jon Stewart - s17e134 - Jessica Biel.mkv
Aug 12 19:00:35 UnRaid kernel: BUG: unable to handle kernel NULL pointer dereference at 00000056
Aug 12 19:00:35 UnRaid kernel: IP: [<c107e2b4>] __kmalloc_track_caller+0xb7/0xef
Aug 12 19:00:35 UnRaid kernel: *pdpt = 000000003746e001 *pde = 0000000000000000 
Aug 12 19:00:35 UnRaid kernel: Oops: 0000 [#1] SMP 
Aug 12 19:00:35 UnRaid kernel: Modules linked in: md_mod xor asus_atk0110 pata_jmicron hwmon sata_promise jmicron ata_piix i2c_i801 i2c_core r8169 [last unloaded: md_mod]
Aug 12 19:00:35 UnRaid kernel: 
Aug 12 19:00:35 UnRaid kernel: Pid: 21135, comm: shfs Tainted: G        W   3.1.0-unRAID #2 System manufacturer System Product Name/P5QL/EPU
Aug 12 19:00:35 UnRaid kernel: EIP: 0060:[<c107e2b4>] EFLAGS: 00010206 CPU: 0
Aug 12 19:00:35 UnRaid kernel: EIP is at __kmalloc_track_caller+0xb7/0xef
Aug 12 19:00:35 UnRaid kernel: EAX: 00000000 EBX: f46ce900 ECX: 00004201 EDX: 00004201
Aug 12 19:00:35 UnRaid kernel: ESI: f1c02580 EDI: 00000056 EBP: f1c0ff08 ESP: f1c0feec
Aug 12 19:00:35 UnRaid kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 12 19:00:35 UnRaid kernel: Process shfs (pid: 21135, ti=f1c0e000 task=f1d86f40 task.ti=f4764000)
Aug 12 19:00:35 UnRaid kernel: Stack:
Aug 12 19:00:35 UnRaid kernel:  c128f717 00004201 00000020 00000056 f46ce900 00000020 000001fa f1c0ff28
Aug 12 19:00:35 UnRaid kernel:  c128e9ce 00000000 f1c02180 000001c0 f158f000 f75d8000 00000000 f1c0ff38
Aug 12 19:00:35 UnRaid kernel:  c128f717 ffffffff f158f000 f1c0ff80 f8427c6c f158f40c f158f400 f158fabc
Aug 12 19:00:35 UnRaid kernel: Call Trace:
Aug 12 19:00:35 UnRaid kernel:  [<c128f717>] ? __netdev_alloc_skb+0x16/0x31
Aug 12 19:00:35 UnRaid kernel:  [<c128e9ce>] __alloc_skb+0x53/0xf1
Aug 12 19:00:35 UnRaid kernel:  [<c128f717>] __netdev_alloc_skb+0x16/0x31
Aug 12 19:00:35 UnRaid kernel:  [<f8427c6c>] rtl8169_rx_interrupt+0x19c/0x2c4 [r8169]
Aug 12 19:00:35 UnRaid kernel:  [<f8427dbd>] rtl8169_poll+0x29/0x126 [r8169]
Aug 12 19:00:35 UnRaid kernel:  [<c12948fb>] net_rx_action+0x59/0x12a
Aug 12 19:00:35 UnRaid kernel:  [<c102c9da>] __do_softirq+0x6b/0xe5
Aug 12 19:00:35 UnRaid kernel:  [<c102c96f>] ? irq_enter+0x3c/0x3c
Aug 12 19:00:35 UnRaid kernel:  <IRQ> 
Aug 12 19:00:35 UnRaid kernel:  [<c102c82d>] ? irq_exit+0x32/0x53
Aug 12 19:00:35 UnRaid kernel:  [<c10035ab>] ? do_IRQ+0x7c/0x90
Aug 12 19:00:35 UnRaid kernel:  [<c130bea9>] ? common_interrupt+0x29/0x30
Aug 12 19:00:35 UnRaid kernel:  [<c1300000>] ? quirk_usb_handoff_xhci+0x85/0x1ac
Aug 12 19:00:35 UnRaid kernel: Code: 8b 08 85 c9 89 4d f0 75 16 8b 7d e4 8b 55 ec 50 89 f0 89 f9 e8 12 ec ff ff 59 89 45 f0 eb 1f 8b 7d f0 8b 46 14 8b 4d e8 8b 55 e8 <8b> 1c 07 89 f8 41 8b 3e 64 0f c7 0f 0f 94 c0 84 c0 74 b3 83 7d 
Aug 12 19:00:35 UnRaid kernel: EIP: [<c107e2b4>] __kmalloc_track_caller+0xb7/0xef SS:ESP 0068:f1c0feec
Aug 12 19:00:35 UnRaid kernel: CR2: 0000000000000056
Aug 12 19:00:35 UnRaid kernel: ---[ end trace cb01a034686e189d ]---
Aug 12 19:00:35 UnRaid kernel: Kernel panic - not syncing: Fatal exception in interrupt
Aug 12 19:00:35 UnRaid kernel: Pid: 21135, comm: shfs Tainted: G      D W   3.1.0-unRAID #2
Aug 12 19:00:35 UnRaid kernel: Call Trace:
Aug 12 19:00:35 UnRaid kernel:  [<c13092db>] panic+0x50/0x143
Aug 12 19:00:35 UnRaid kernel:  [<c100484e>] oops_end+0x6e/0x7c
Aug 12 19:00:35 UnRaid kernel:  [<c101adf1>] no_context+0xac/0xb6
Aug 12 19:00:35 UnRaid kernel:  [<c101aee3>] __bad_area_nosemaphore+0xe8/0xf0
Aug 12 19:00:35 UnRaid kernel:  [<c101b09a>] ? mm_fault_error+0x129/0x129
Aug 12 19:00:35 UnRaid kernel:  [<c101aef8>] bad_area_nosemaphore+0xd/0x10
Aug 12 19:00:35 UnRaid kernel:  [<c101b1eb>] do_page_fault+0x151/0x332
Aug 12 19:00:35 UnRaid kernel:  [<c128e9a4>] ? __alloc_skb+0x29/0xf1
Aug 12 19:00:35 UnRaid kernel:  [<c123033d>] ? ata_scsi_translate+0xbf/0xed
Aug 12 19:00:35 UnRaid kernel:  [<c101b09a>] ? mm_fault_error+0x12
unraid: Host is down
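
For anyone skimming a trace like the one above, the useful bits are the BUG line, the function named on the "EIP is at" line, and the frames in the first call trace that are not prefixed with `?` (the kernel prints `?` when an address was found on the stack but could not be confirmed as part of the actual call chain). Below is a minimal Python sketch that pulls those out of syslog lines. The name `summarize_oops` and the regular expressions are assumptions made up for illustration, not part of unRAID or any kernel tool, and they only target the 32-bit oops format shown in this log.

```python
import re

# Hypothetical patterns written for the oops format in this post.
BUG_RE = re.compile(r"kernel: (BUG: .*)$")
EIP_RE = re.compile(r"EIP is at ([\w.]+)\+0x")
# A trace entry looks like " [<c128e9ce>] __alloc_skb+0x53/0xf1", optionally
# with a leading "? " before the symbol and a trailing " [module]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def summarize_oops(lines):
    """Collect the BUG message, faulting function, and reliable frames
    (function, module) from the first call trace in the given lines."""
    summary = {"bug": None, "faulting_function": None, "frames": []}
    trace_count = 0
    in_trace = False
    for line in lines:
        line = line.rstrip("\n")
        if summary["bug"] is None:
            m = BUG_RE.search(line)
            if m:
                summary["bug"] = m.group(1).strip()
        if summary["faulting_function"] is None:
            m = EIP_RE.search(line)
            if m:
                summary["faulting_function"] = m.group(1)
        if "Call Trace:" in line:
            trace_count += 1
            # Only the first trace belongs to the oops itself; the second
            # one in the log is printed by panic().
            in_trace = trace_count == 1
            continue
        if in_trace:
            m = FRAME_RE.search(line)
            if m:
                if not m.group(1):  # skip entries marked with "?"
                    summary["frames"].append((m.group(2), m.group(3)))
            elif "<IRQ>" not in line:
                in_trace = False  # first non-frame line ends the trace
    return summary


SAMPLE = [
    "Aug 12 19:00:35 UnRaid kernel: BUG: unable to handle kernel NULL pointer dereference at 00000056",
    "Aug 12 19:00:35 UnRaid kernel: EIP is at __kmalloc_track_caller+0xb7/0xef",
    "Aug 12 19:00:35 UnRaid kernel: Call Trace:",
    "Aug 12 19:00:35 UnRaid kernel:  [<c128f717>] ? __netdev_alloc_skb+0x16/0x31",
    "Aug 12 19:00:35 UnRaid kernel:  [<c128e9ce>] __alloc_skb+0x53/0xf1",
    "Aug 12 19:00:35 UnRaid kernel:  [<f8427c6c>] rtl8169_rx_interrupt+0x19c/0x2c4 [r8169]",
    "Aug 12 19:00:35 UnRaid kernel:  <IRQ> ",
    "Aug 12 19:00:35 UnRaid kernel:  [<c102c82d>] ? irq_exit+0x32/0x53",
    "Aug 12 19:00:35 UnRaid kernel: Code: 8b 08 85 c9",
]

if __name__ == "__main__":
    result = summarize_oops(SAMPLE)
    print(result["bug"])
    print("faulted in:", result["faulting_function"])
    for func, module in result["frames"]:
        print("  ", func, f"[{module}]" if module else "")
```

Run against the full log, a summary like this would list the allocation failing in `__kmalloc_track_caller` while the r8169 driver was handling a receive interrupt. That narrows down where the fault was hit, but it does not by itself identify the root cause.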

Drive contents (at a file level) are perfect: I mounted the disk independently and did a diff against my production server (which is an rsync mirror master). reiserfsck also found no issues. The disk is fine, and I got it back online with no big problem. Very likely a crappy SATA cable.
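
A file-level check of this kind can be sketched in Python as below. This is not the command that was actually run (that was a plain diff against the mirror); the function name `differing_files` and any paths passed to it are hypothetical, and the sketch assumes both trees are mounted locally.

```python
import filecmp
import os


def differing_files(left, right):
    """Return sorted relative paths that differ between two directory
    trees or that exist on only one side."""
    problems = []

    def walk(cmp, rel):
        # Entries present on only one side.
        for name in cmp.left_only + cmp.right_only:
            problems.append(os.path.join(rel, name))
        # Byte-for-byte comparison of files present on both sides
        # (shallow=False avoids trusting size and mtime alone).
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        for name in mismatch + errors:
            problems.append(os.path.join(rel, name))
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(rel, name))

    walk(filecmp.dircmp(left, right), "")
    return sorted(problems)
```

An empty list from `differing_files("/mnt/suspect_disk", "/mnt/mirror")` (both paths made up here) would correspond to the "perfect" result described above.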

However, that's not REALLY the issue. Regardless of whether I have a red-balled disk, I should not be getting a kernel panic every time mover runs.

For giggles, I got my test array back online (all green), checked parity, etc. Then I updated to rc5 and turned mover back on. Same panic.

Now... my test server is not exactly going to replace a Google datacenter anytime soon. It's not much more than a pile of parts that were on their way to a dumpster (disks, a power supply, and a motherboard lying in the corner of an unused bench; excellent thermal performance when you don't bother using a case). Having said that, my uptime with b13 running CP/SB/Sabnzbd/CrashPlan/PMS/Hamachi was pretty darn huge.

More to follow.

  • 2 weeks later...

I am still running my production server on b13. I am contemplating upgrading to a newer version, but since my server has been highly reliable on this beta, I am reluctant to change. I just replaced an older 1 TB drive with a new 3 TB drive, and everything appears to have worked smoothly.

I am not running any add-ons other than unmenu. 
