RFehr Posted August 13, 2012

Ok - I'm behind. However, my b13 system was EXTREMELY stable until a few days ago, when I added the latest CouchPotato_v2 plugin and updated Plex Media Server to the latest (0.9.6.6). Got a red ball on a known-good disk. Ran SMART tests to confirm. Tried the trust array procedure, which resulted in the mdcmd invalid argument error already covered in another thread. Started the array anyway (unprotected) and disabled anything that would perform writes, other than mover. I did not realize there was still data on the cache to sync. Nonetheless, this scenario should not have caused a panic. I'm updating to the latest rc, hoping that the mdcmd trust array fix is in place.

Aug 12 19:00:01 UnRaid logger: mover started
Aug 12 19:00:01 UnRaid logger: skipping Apps/
Aug 12 19:00:01 UnRaid logger: moving Media/
Aug 12 19:00:01 UnRaid logger: moving Television/
Aug 12 19:00:14 UnRaid logger: ./Television/The Daily Show with Jon Stewart/The Daily Show with Jon Stewart - s17e134 - Jessica Biel.mkv
Aug 12 19:00:14 UnRaid logger: rsync: get_xattr_data: lgetxattr(".","user.org.netatalk.supports-eas.aK2YbM",0) failed: Exec format error (
Aug 12 19:00:14 UnRaid logger: rsync: get_xattr_data: lgetxattr("Television","user.org.netatalk.supports-eas.PvAD6m",0) failed: Exec format error (
Aug 12 19:00:14 UnRaid logger: >f......... Television/The Daily Show with Jon Stewart/The Daily Show with Jon Stewart - s17e134 - Jessica Biel.mkv
Aug 12 19:00:35 UnRaid kernel: BUG: unable to handle kernel NULL pointer dereference at 00000056
Aug 12 19:00:35 UnRaid kernel: IP: [<c107e2b4>] __kmalloc_track_caller+0xb7/0xef
Aug 12 19:00:35 UnRaid kernel: *pdpt = 000000003746e001 *pde = 0000000000000000
Aug 12 19:00:35 UnRaid kernel: Oops: 0000 [#1] SMP
Aug 12 19:00:35 UnRaid kernel: Modules linked in: md_mod xor asus_atk0110 pata_jmicron hwmon sata_promise jmicron ata_piix i2c_i801 i2c_core r8169 [last unloaded: md_mod]
Aug 12 19:00:35 UnRaid kernel:
Aug 12 19:00:35 UnRaid kernel: Pid: 21135, comm: shfs Tainted: G W 3.1.0-unRAID #2 System manufacturer System Product Name/P5QL/EPU
Aug 12 19:00:35 UnRaid kernel: EIP: 0060:[<c107e2b4>] EFLAGS: 00010206 CPU: 0
Aug 12 19:00:35 UnRaid kernel: EIP is at __kmalloc_track_caller+0xb7/0xef
Aug 12 19:00:35 UnRaid kernel: EAX: 00000000 EBX: f46ce900 ECX: 00004201 EDX: 00004201
Aug 12 19:00:35 UnRaid kernel: ESI: f1c02580 EDI: 00000056 EBP: f1c0ff08 ESP: f1c0feec
Aug 12 19:00:35 UnRaid kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 12 19:00:35 UnRaid kernel: Process shfs (pid: 21135, ti=f1c0e000 task=f1d86f40 task.ti=f4764000)
Aug 12 19:00:35 UnRaid kernel: Stack:
Aug 12 19:00:35 UnRaid kernel: c128f717 00004201 00000020 00000056 f46ce900 00000020 000001fa f1c0ff28
Aug 12 19:00:35 UnRaid kernel: c128e9ce 00000000 f1c02180 000001c0 f158f000 f75d8000 00000000 f1c0ff38
Aug 12 19:00:35 UnRaid kernel: c128f717 ffffffff f158f000 f1c0ff80 f8427c6c f158f40c f158f400 f158fabc
Aug 12 19:00:35 UnRaid kernel: Call Trace:
Aug 12 19:00:35 UnRaid kernel: [<c128f717>] ? __netdev_alloc_skb+0x16/0x31
Aug 12 19:00:35 UnRaid kernel: [<c128e9ce>] __alloc_skb+0x53/0xf1
Aug 12 19:00:35 UnRaid kernel: [<c128f717>] __netdev_alloc_skb+0x16/0x31
Aug 12 19:00:35 UnRaid kernel: [<f8427c6c>] rtl8169_rx_interrupt+0x19c/0x2c4 [r8169]
Aug 12 19:00:35 UnRaid kernel: [<f8427dbd>] rtl8169_poll+0x29/0x126 [r8169]
Aug 12 19:00:35 UnRaid kernel: [<c12948fb>] net_rx_action+0x59/0x12a
Aug 12 19:00:35 UnRaid kernel: [<c102c9da>] __do_softirq+0x6b/0xe5
Aug 12 19:00:35 UnRaid kernel: [<c102c96f>] ? irq_enter+0x3c/0x3c
Aug 12 19:00:35 UnRaid kernel: <IRQ>
Aug 12 19:00:35 UnRaid kernel: [<c102c82d>] ? irq_exit+0x32/0x53
Aug 12 19:00:35 UnRaid kernel: [<c10035ab>] ? do_IRQ+0x7c/0x90
Aug 12 19:00:35 UnRaid kernel: [<c130bea9>] ? common_interrupt+0x29/0x30
Aug 12 19:00:35 UnRaid kernel: [<c1300000>] ? quirk_usb_handoff_xhci+0x85/0x1ac
Aug 12 19:00:35 UnRaid kernel: Code: 8b 08 85 c9 89 4d f0 75 16 8b 7d e4 8b 55 ec 50 89 f0 89 f9 e8 12 ec ff ff 59 89 45 f0 eb 1f 8b 7d f0 8b 46 14 8b 4d e8 8b 55 e8 <8b> 1c 07 89 f8 41 8b 3e 64 0f c7 0f 0f 94 c0 84 c0 74 b3 83 7d
Aug 12 19:00:35 UnRaid kernel: EIP: [<c107e2b4>] __kmalloc_track_caller+0xb7/0xef SS:ESP 0068:f1c0feec
Aug 12 19:00:35 UnRaid kernel: CR2: 0000000000000056
Aug 12 19:00:35 UnRaid kernel: ---[ end trace cb01a034686e189d ]---
Aug 12 19:00:35 UnRaid kernel: Kernel panic - not syncing: Fatal exception in interrupt
Aug 12 19:00:35 UnRaid kernel: Pid: 21135, comm: shfs Tainted: G D W 3.1.0-unRAID #2
Aug 12 19:00:35 UnRaid kernel: Call Trace:
Aug 12 19:00:35 UnRaid kernel: [<c13092db>] panic+0x50/0x143
Aug 12 19:00:35 UnRaid kernel: [<c100484e>] oops_end+0x6e/0x7c
Aug 12 19:00:35 UnRaid kernel: [<c101adf1>] no_context+0xac/0xb6
Aug 12 19:00:35 UnRaid kernel: [<c101aee3>] __bad_area_nosemaphore+0xe8/0xf0
Aug 12 19:00:35 UnRaid kernel: [<c101b09a>] ? mm_fault_error+0x129/0x129
Aug 12 19:00:35 UnRaid kernel: [<c101aef8>] bad_area_nosemaphore+0xd/0x10
Aug 12 19:00:35 UnRaid kernel: [<c101b1eb>] do_page_fault+0x151/0x332
Aug 12 19:00:35 UnRaid kernel: [<c128e9a4>] ? __alloc_skb+0x29/0xf1
Aug 12 19:00:35 UnRaid kernel: [<c123033d>] ? ata_scsi_translate+0xbf/0xed
Aug 12 19:00:35 UnRaid kernel: [<c101b09a>] ? mm_fault_error+0x12unraid: Host is down
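For anyone wanting to skim a trace like the one in the log above, here is a minimal Python sketch that pulls the function names (and the module, where one is listed) out of the stack frames. It is only an illustration, not a tool anyone in this thread used: the names `FRAME` and `parse_frames` are made up for this example, and the regex simply matches any `[<address>] symbol+0xOFF/0xLEN` entry, so it would also pick up the `EIP:` line if you fed it the whole log. Frames that the kernel prints with a leading `?` are flagged as unreliable.

```python
import re

# Hypothetical helper: matches entries such as
#   [<c128e9ce>] __alloc_skb+0x53/0xf1
#   [<c102c96f>] ? irq_enter+0x3c/0x3c
#   [<f8427c6c>] rtl8169_rx_interrupt+0x19c/0x2c4 [r8169]
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack-frame entry found in log_text."""
    frames = []
    for m in FRAME.finditer(log_text):
        frames.append({
            "symbol": m.group("symbol"),
            "module": m.group("module"),  # None when no module is listed
            "reliable": m.group("unreliable") is None,  # "?" marks a guess
        })
    return frames


# A few lines copied from the trace in the post above.
sample = """\
Aug 12 19:00:35 UnRaid kernel: [<c128e9ce>] __alloc_skb+0x53/0xf1
Aug 12 19:00:35 UnRaid kernel: [<c128f717>] __netdev_alloc_skb+0x16/0x31
Aug 12 19:00:35 UnRaid kernel: [<f8427c6c>] rtl8169_rx_interrupt+0x19c/0x2c4 [r8169]
Aug 12 19:00:35 UnRaid kernel: [<c102c96f>] ? irq_enter+0x3c/0x3c
"""

frames = parse_frames(sample)
for frame in frames:
    marker = "" if frame["reliable"] else " (unreliable)"
    module = f" [{frame['module']}]" if frame["module"] else ""
    print(f"{frame['symbol']}{module}{marker}")
```

Listing the frames this way only shows which code paths were on the stack when the oops fired (here, frames from the r8169 module among them); it does not by itself establish what caused the panic.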
dgaschk Posted August 13, 2012

The drive may be good but its contents are guaranteed to be wrong. The disk was disabled because a write to it failed. The disk needs to be rebuilt.
RFehr Posted August 14, 2012

Drive contents (at a file level) are perfect - I mounted the disk independently and did a diff against my production server (which is an rsync mirror master). reiserfsck also yielded no issues. Disk is fine. Got the disk back online - no big problem. Very likely a crappy SATA cable issue.

However, that's not REALLY the issue - regardless of whether I have a redball disk, I should not be getting a kernel panic every time mover runs. For giggles, I got my test array back online - all green - checked parity, etc. Then I updated to rc5 and turned mover back on. Same panic.

Now... my test server is not exactly going to replace a Google datacenter anytime soon - it's not much more than a pile of parts that were on their way to a dumpster (like... disks, a PS, and a MB lying in the corner of an unused bench - excellent thermal performance when you don't bother using a case). But having said that, my uptime with b13 running CP/SB/Sabnzbd/CrashPlan/PMS/Hamachi was pretty darn huge. More to follow.
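The file-level check described in this post (comparing a mounted disk against a mirror) can be sketched in Python. RFehr used diff; the snippet below is only an illustrative equivalent built on the standard `filecmp` module, and the function name `tree_differences` and the demo directory names are made up for the example. It compares file contents byte by byte rather than relying on size and timestamp.

```python
import filecmp
import os
import tempfile


def tree_differences(left, right):
    """List (kind, relative_path) for entries missing on one side or differing."""
    problems = []

    def walk(cmp, prefix):
        for name in cmp.left_only:
            problems.append(("only_left", os.path.join(prefix, name)))
        for name in cmp.right_only:
            problems.append(("only_right", os.path.join(prefix, name)))
        # shallow=False forces a content comparison instead of an os.stat check.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        for name in mismatch + errors:
            problems.append(("differs", os.path.join(prefix, name)))
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    # ignore=[] so that nothing is skipped by dircmp's default ignore list.
    walk(filecmp.dircmp(left, right, ignore=[]), "")
    return problems


# Small self-contained demo using throwaway directories (hypothetical names).
with tempfile.TemporaryDirectory() as root:
    disk = os.path.join(root, "disk")
    mirror = os.path.join(root, "mirror")
    for base in (disk, mirror):
        os.makedirs(os.path.join(base, "Television"))
        with open(os.path.join(base, "Television", "same.mkv"), "w") as fh:
            fh.write("identical")
    with open(os.path.join(disk, "Television", "changed.mkv"), "w") as fh:
        fh.write("aaaa")
    with open(os.path.join(mirror, "Television", "changed.mkv"), "w") as fh:
        fh.write("bbbb")
    with open(os.path.join(disk, "extra.txt"), "w") as fh:
        fh.write("x")
    demo_result = sorted(tree_differences(disk, mirror))

print(demo_result)
```

An empty result means the two trees match at the file level. That says nothing about parity, which is the point dgaschk raises above.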
PeteAron Posted August 27, 2012

I am still running my production server on b13. I am contemplating upgrading to a newer version, but since my server has been highly reliable using this beta I am reluctant to change. I just replaced an older 1 TB drive with a new 3 TB drive and everything appears to have worked smoothly. I am not running any add-ons other than unmenu.
This topic is now archived and is closed to further replies.