Stokkes Posted May 28, 2012
Hey, I can't speak to the previous 5.0 rc versions, but I had been running beta14 since January without any problems. I decided to upgrade to 5.0-rc3 and have been getting some crazy stack traces and crashes (they appear on the monitor connected directly to the unRAID server). Once this happens, the server is completely unresponsive (telnet and the web interfaces stop working) and needs to be shut down using the power button (thus losing the syslog?). Reverting to b14 fixes these issues completely. Incidentally, if I DO run an addon (Plex Media Server), it continues to work even after the crash (I'm thinking maybe it's related to emhttp?). I have also disabled all my addons, but I keep getting crashes every few days. My previous uptime record with b14 was about 7 weeks. With 5.0-rc3 I can't even get 2-3 days. Any ideas?
Joe L. Posted May 28, 2012
Post a picture of the screen for analysis.
jdog09 Posted May 30, 2012
Likewise, beta14 worked like a champ. I will snap a picture of the stack traces on the monitor the next time it goes toes up. I just rebooted, so it will be a few days.
Stokkes Posted May 31, 2012
Here's a shot of my stack trace (the server is now inaccessible, no syslog obviously):
[screenshot of the stack trace]
Joe L. Posted June 1, 2012
Looks like it was trying to allocate memory. How much memory do you have installed? What add-ons, if any, are (were) running?
Stokkes Posted June 1, 2012
2GB installed. No addons running.
limetech Posted June 1, 2012
Please try this: get your server back up, then open a telnet session on your PC and type:
tail -f /var/log/syslog
Now let it run. I want to see what gets logged to the system log before the crash.
Note: you could also click the 'Log' button on the webGui menu bar, but there is more code between log entries being generated and them being sent to your browser, so it might not catch as much.
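Since the syslog is lost when the server has to be powered off (as noted in the first post), a complementary option is to keep a running copy of the log somewhere that survives the reboot. The following is only a minimal sketch: `capture_log` is a hypothetical helper name, and the example paths (in particular a flash drive mounted at `/boot`) are assumptions about the system, not something stated in this thread.

```shell
# Hypothetical helper: follow a log file and append everything to a second file.
# $1 = log file to follow, $2 = copy that should survive a hard power-off.
capture_log() {
    # -n +1 copies the existing contents first; -F keeps following if the log is rotated.
    tail -n +1 -F "$1" >> "$2" 2>/dev/null &
    echo "$!"   # PID of the background tail, so it can be stopped later with kill
}

# Example call (both paths are assumptions):
# capture_log /var/log/syslog /boot/syslog-capture.txt
```

Lines written immediately before a hard crash may still be missing if they were never flushed to the device, so the telnet `tail -f` session remains worth running alongside it.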
Stokkes Posted June 1, 2012
I've got a terminal session open on my Mac Mini that'll run until the next crash. I will post the contents of the log once it happens. As a note, I also re-seated my RAM and changed its slots. I may look for more RAM sitting around that I could use, in case that's what it is.
limetech Posted June 1, 2012
Of course you can run memtest for a pass or two to see if it finds anything, though I have seen cases of bad RAM that passed memtest. Another test would be to pull out a stick and just run with 1 GB of memory. If that fails, pull it out and try the other stick.
Stokkes Posted June 1, 2012
I'll try one thing at a time. I will see what the syslog shows during the crash, then I'll post. I'll also try removing sticks and running an extended memtest. I ran a few runs and it passed, so who knows. Will be in touch.
jdog09 Posted June 1, 2012
I am down again too. When I get home and physically reboot, I can take a picture of the stack trace on the monitor. I will open a tail -f on the syslog and let it run.
jdog09 Posted June 3, 2012
Here is the syslog. It didn't even get through the parity check. A lot of R8139 messages.
Attachment: Unraiderro20120601.txt
RobJ Posted June 3, 2012
In the tail report you attached, there are no stack traces or anything related to them, but you do have an issue with your Realtek support (a different issue from the one this thread is about, but still significant). You appear to have the same problem that Concorde Rules has with his Realtek network chipset (see his example and the replies from me and pantner elsewhere on the forum). Basically, this particular Realtek chipset appears to work for a while, then misbehaves badly and spews 'eth0: link up' messages, with little to no network accessibility afterward. Tom included a firmware patch for Realtek chipsets in the -rc3 release, but it does not appear to fix the problem completely (although there seem to be fewer complaints now).
Stokkes Posted June 4, 2012
I now know what is causing my stack traces: the mover script. Here's what I captured in my tail before it died:
stewie kernel: Oops: 0000 [#1] SMP
Jun 3 21:22:05 stewie kernel: Call Trace:
Jun 3 21:22:05 stewie kernel: [<c130ce2f>] panic+0x50/0x13a
Jun 3 21:22:05 stewie kernel: [<c100490a>] oops_end+0x6e/0x7c
Jun 3 21:22:05 stewie kernel: [<c101b2d9>] no_context+0xac/0xb6
Jun 3 21:22:05 stewie kernel: [<c101b3cb>] __bad_area_nosemaphore+0xe8/0xf0
Jun 3 21:22:05 stewie kernel: [<c101b582>] ? mm_fault_error+0x129/0x129
Jun 3 21:22:05 stewie kernel: [<c101b3e0>] bad_area_nosemaphore+0xd/0x10
Jun 3 21:22:05 stewie kernel: [<c101b6d2>] do_page_fault+0x150/0x332
Jun 3 21:22:05 stewie kernel: [<c12af4f9>] ? ip_local_deliver+0x61/0x66
Jun 3 21:22:05 stewie kernel: [<c12af06f>] ? ip_rcv_finish+0x263/0x28b
Jun 3 21:22:05 stewie kernel: [<c12af2d0>] ? ip_rcv+0x239/0x26f
Jun 3 21:22:05 stewie kernel: [<c1295651>] ? __netif_receive_skb+0x23a/0x260
Jun 3 21:22:05 stewie kernel: [<c101b582>] ? mm_fault_error+0x129/0x129
Jun 3 21:22:05 stewie kernel: [<c130f302>] error_code+0x5a/0x60
Jun 3 21:22:05 stewie kernel: [<c12900d8>] ? skb_copy_and_csum_bits+0xff/0x221
Jun 3 21:22:05 stewie kernel: [<c101b582>] ? mm_fault_error+0x129/0x129
Jun 3 21:22:05 stewie kernel: [<f848fbab>] ? sky2_tx_complete+0x74/0xb8 [sky2]
Jun 3 21:22:05 stewie kernel: [<f8492481>] s
Not sure why I didn't get more. I confirmed this by executing the mover manually a few minutes ago, and it killed my server.
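For comparing traces between crashes, it can help to reduce a captured trace to just its function names. A rough sketch, assuming the syslog was saved to a file with one entry per line in the `[<address>] symbol+0xoffset/0xlength` form shown above; `trace_symbols` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print the function name from each call-trace entry in a saved syslog.
# $1 = path to the saved log file.
trace_symbols() {
    # Match "[<hex address>] ", an optional "? " marker, then capture the symbol before "+0x".
    # Lines without a trace entry (e.g. "Call Trace:") are not printed.
    sed -n 's/.*\[<[0-9a-f]*>] \(? \)\{0,1\}\([A-Za-z0-9_.]*\)+0x.*/\2/p' "$1"
}

# Example call (the file name is an assumption):
# trace_symbols syslog-capture.txt
```

The output is one symbol per line in trace order, which is easier to diff or search for than the raw log.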
Stokkes Posted June 4, 2012
The mover log I managed to catch:
mover started
skipping TimeMachine/
skipping data/
moving newsgroups/
moving videos/
./videos/.AppleDouble/.apdisk
.d..t...... ./
rsync: get_xattr_names: llistxattr("videos",1024) failed: Software caused connection abort (103)
.d..t.....x videos/
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)
cd+++++++++ videos/.AppleDouble/
rsync: recv_generator: mkdir "/mnt/user0/videos/.AppleDouble" failed: Transport endpoint is not connected (107)
*** Skipping any contents from this failed directory ***
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1042) [sender=3.0.7]
./videos/.apdisk
rsync: get_acl: sys_acl_get_file(., ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: get_xattr_names: llistxattr(".",1024) failed: Transport endpoint is not connected (107)
.d..t....a. ./
rsync: get_acl: sys_acl_get_file(., ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/.": Transport endpoint is not connected (107)
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)
.d..t....ax videos/
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)
rsync: recv_generator: failed to stat "/mnt/user0/videos/.apdisk": Transport endpoint is not connected (107)
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)
---TRIMMED---- (too long to post)
./videos/.AppleDB
rsync: ERROR: cannot stat destination "/mnt/user0/": Transport endpoint is not connected (107)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(555) [Receiver=3.0.7]
rsync: connection unexpectedly closed (9 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [sender=3.0.7]
mover finished
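With a mover log this repetitive, a quick tally of the error messages shows how the failures are distributed, for example a single "Software caused connection abort (103)" followed by many "Transport endpoint is not connected (107)". A small sketch, assuming the mover output was saved to a file with one message per line; `summarize_rsync_errors` is a hypothetical helper name.

```shell
# Hypothetical helper: count rsync failure messages in a saved mover log, grouped by error text.
# $1 = path to the saved mover output.
summarize_rsync_errors() {
    # Keep only "rsync: ...: <message> (<errno>)" lines, reduce each to "<message> (<errno>)",
    # then count identical entries and list the most frequent first.
    sed -n 's/^rsync: .*: \([A-Za-z ]* ([0-9]*)\)$/\1/p' "$1" | sort | uniq -c | sort -rn
}

# Example call (the file name is an assumption):
# summarize_rsync_errors mover-output.txt
```

Summary lines such as `rsync error: ... (code 23)` are deliberately skipped, since they report the exit status rather than an individual failed operation.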
Stokkes Posted June 4, 2012
Executed the mover AGAIN (removed all the AFP crap, I thought maybe that could cause some issues, but alas):
mover started
skipping TimeMachine/
skipping data/
moving newsgroups/
moving videos/
./videos/movies/movie.mkv
.d..t...... ./
rsync: get_xattr_names: llistxattr("videos",1024) failed: Software caused connection abort (103)
.d..t.....x videos/
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)
cd+++++++++ videos/movies/
rsync: recv_generator: mkdir "/mnt/user0/videos/movies" failed: Transport endpoint is not connected (107)
*** Skipping any contents from this failed directory ***
rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)
rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)
rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1042) [sender=3.0.7]
find: `fuser' terminated by signal 9
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: Call Trace:
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: Oops: 0000 [#1] SMP
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: Process fuser (pid: 8949, ti=e9e0e000 task=db56f020 task.ti=e9e0e000)
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: Code: c7 0f 84 ad 00 00 00 8d 40 40 66 c7 03 00 a0 e8 f4 b7 25 00 8b 47 04 3b 30 0f 83 8b 00 00 00 c1 e6 02 03 70 04 8b 06 85 c0 74 7f <f6> 40 24 01 74 05 66 81 0b 40 01 f6 40 24 02 74 05 66 81 0b c0
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: CR2: 0000000074617489
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: EIP: [<c10b3252>] proc_fd_instantiate+0x65/0xf7 SS:ESP 0068:e9e0feb4
Message from syslogd@stewie at Sun Jun 3 21:46:55 2012 ...
stewie kernel: Stack:
steved Posted June 10, 2012
I am having this same issue. I have run a memory test and replaced the memory and power supply, but that has not resolved the issue. I am now downgrading to 4.7, since everything was stable until the 5.0-rc3 upgrade.