Stack traces / crashes


Hey,

 

I can't speak to the previous 5.0 rc versions, but I had been running beta14 since January without any problems. I decided to upgrade to 5.0-rc3 and have been getting some crazy stack traces / crashes (these appear on the monitor connected directly to the unraid server). Once this happens, the server is completely unresponsive (telnet and the web interfaces stop working) and needs to be shut down using the power button (thus losing the syslog?). Reverting back to b14 fixes these issues completely.

 

Incidentally, if I DO run an addon (Plex Media Server), it continues to work even after the crash (I'm thinking maybe it's related to emhttp?). I have also disabled all my addons, but I keep getting crashes every few days. My previous uptime record with b14 was about 7 weeks. With 5.0-rc3 I can't even get 2-3 days.

 

Any ideas?

 

Link to comment

Please try this: get your server back up, and then open a telnet session on your PC and type:

 

tail -f /var/log/syslog

 

Now let it run.  I want to see what gets logged to the system log before the crash.

 

Note: you could also click on the 'Log' button on the webGui menu bar, but there is more code between log entries being generated and being sent to your browser, so it might not catch as much.
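
If the syslog really is lost on a hard power-off, it may help to keep a second copy of whatever the tail prints. A minimal sketch, assuming a POSIX shell: `keep_copy` is a made-up helper name, and the destination path in the usage comment is only an example (the flash drive is normally mounted at /boot on unRAID, but check your own setup).

```shell
# keep_copy FILE
# Pass everything read on stdin through to stdout unchanged and
# append the same lines to FILE, so a copy outlives the session.
# (keep_copy is a hypothetical helper name, not an unRAID command.)
keep_copy() {
    tee -a "$1"
}

# Example use on the server (destination path is an assumption):
#   tail -f /var/log/syslog | keep_copy /boot/syslog-live.txt
```

Lines written just before a hard lock may still not make it to disk, so the copy scrolling in the telnet window is probably the more reliable of the two.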

Link to comment

I've got a terminal session open on my Mac Mini that'll run until the next crash. I'll post the contents of the log once it happens.

 

As a note, I also re-seated my RAM and changed its slots. I may look for more RAM sitting around that I could use in case that's what it is.

Link to comment

Of course you can run memtest for a pass or two to see if it finds anything, though I have seen cases of bad RAM that passed memtest.  Another test would be to pull out a stick and just run with 1G of memory.  If that fails, pull it out and try the other stick.

Link to comment

Here is the syslog. It didn't even get through the parity check. There are a lot of R8139 messages.

 

In the tail report you attached, there are no stack traces or anything related to that, but you do have an issue with your Realtek support (a different issue from the one this thread is about, but still significant).  You appear to have the same problem that Concorde Rules has with his Realtek network chipset (an example here and my reply here and pantner's reply here).  Basically, this particular Realtek chipset appears to work for a while, then misbehaves badly and spews 'eth0: link up' messages, with little to no network accessibility afterward.  Tom included a firmware patch for Realtek chipsets in the -rc3 release, but it does not appear to completely fix the problem (although there seem to be fewer complaints now).
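
To get a rough idea of how badly a saved syslog is affected, you could count the 'eth0: link up' lines in it. This is only a sketch: `count_link_up` is a made-up name, and the file you pass in is whatever copy of the syslog you saved.

```shell
# count_link_up FILE
# Print how many lines of FILE contain the 'eth0: link up' message.
# (count_link_up is a hypothetical helper name.)
count_link_up() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so mask the status to keep the function usable under 'set -e'.
    grep -c 'eth0: link up' "$1" || true
}

# Example use (the path is an assumption):
#   count_link_up /boot/syslog-live.txt
```

A handful of these around boot is normal; a long run of them in a short time is the pattern described above.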

Link to comment

I now know what is causing my stack traces: the mover script.

 

Here's what I captured in my tail before it died:

 

stewie kernel: Oops: 0000 [#1] SMP

Jun  3 21:22:05 stewie kernel: Call Trace:

Jun  3 21:22:05 stewie kernel:  [<c130ce2f>] panic+0x50/0x13a

Jun  3 21:22:05 stewie kernel:  [<c100490a>] oops_end+0x6e/0x7c

Jun  3 21:22:05 stewie kernel:  [<c101b2d9>] no_context+0xac/0xb6

Jun  3 21:22:05 stewie kernel:  [<c101b3cb>] __bad_area_nosemaphore+0xe8/0xf0

Jun  3 21:22:05 stewie kernel:  [<c101b582>] ? mm_fault_error+0x129/0x129

Jun  3 21:22:05 stewie kernel:  [<c101b3e0>] bad_area_nosemaphore+0xd/0x10

Jun  3 21:22:05 stewie kernel:  [<c101b6d2>] do_page_fault+0x150/0x332

Jun  3 21:22:05 stewie kernel:  [<c12af4f9>] ? ip_local_deliver+0x61/0x66

Jun  3 21:22:05 stewie kernel:  [<c12af06f>] ? ip_rcv_finish+0x263/0x28b

Jun  3 21:22:05 stewie kernel:  [<c12af2d0>] ? ip_rcv+0x239/0x26f

Jun  3 21:22:05 stewie kernel:  [<c1295651>] ? __netif_receive_skb+0x23a/0x260

Jun  3 21:22:05 stewie kernel:  [<c101b582>] ? mm_fault_error+0x129/0x129

Jun  3 21:22:05 stewie kernel:  [<c130f302>] error_code+0x5a/0x60

Jun  3 21:22:05 stewie kernel:  [<c12900d8>] ? skb_copy_and_csum_bits+0xff/0x221

Jun  3 21:22:05 stewie kernel:  [<c101b582>] ? mm_fault_error+0x129/0x129

Jun  3 21:22:05 stewie kernel:  [<f848fbab>] ? sky2_tx_complete+0x74/0xb8 [sky2]

Jun  3 21:22:05 stewie kernel:  [<f8492481>] s

 

Not sure why I didn't get more. I confirmed this by executing the mover manually a few minutes ago, and it killed my server.
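
For anyone trying to read a trace like the one above, it can be easier to see the call chain with the timestamps and addresses stripped out. A rough sketch, assuming the trace lines look like the ones posted here; `trace_symbols` is a made-up name. It also drops the '?' marker, which the kernel puts on entries it found on the stack but could not confirm as part of the call chain.

```shell
# trace_symbols [FILE...]
# Print only the function names from kernel call-trace lines of the form
#   ... [<c130ce2f>] panic+0x50/0x13a
#   ... [<c101b582>] ? mm_fault_error+0x129/0x129
# Reads stdin when no FILE is given. Lines that do not look like
# trace entries are skipped.
# (trace_symbols is a hypothetical helper name.)
trace_symbols() {
    sed -n -E 's/.*\[<[0-9a-f]+>\] (\? )?([A-Za-z0-9_.]+)\+0x.*/\2/p' "$@"
}

# Example use (the path is an assumption):
#   trace_symbols /boot/syslog-live.txt
```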

 

Link to comment

The mover log I managed to catch:

 

mover started

skipping TimeMachine/

skipping data/

moving newsgroups/

moving videos/

./videos/.AppleDouble/.apdisk

.d..t...... ./

rsync: get_xattr_names: llistxattr("videos",1024) failed: Software caused connection abort (103)

.d..t.....x videos/

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)

cd+++++++++ videos/.AppleDouble/

rsync: recv_generator: mkdir "/mnt/user0/videos/.AppleDouble" failed: Transport endpoint is not connected (107)

*** Skipping any contents from this failed directory ***

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)

rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1042) [sender=3.0.7]

./videos/.apdisk

rsync: get_acl: sys_acl_get_file(., ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: get_xattr_names: llistxattr(".",1024) failed: Transport endpoint is not connected (107)

.d..t....a. ./

rsync: get_acl: sys_acl_get_file(., ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/.": Transport endpoint is not connected (107)

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)

.d..t....ax videos/

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)

rsync: recv_generator: failed to stat "/mnt/user0/videos/.apdisk": Transport endpoint is not connected (107)

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)

---TRIMMED---- (too long to post)

./videos/.AppleDB

rsync: ERROR: cannot stat destination "/mnt/user0/": Transport endpoint is not connected (107)

rsync error: errors selecting input/output files, dirs (code 3) at main.c(555) [Receiver=3.0.7]

rsync: connection unexpectedly closed (9 bytes received so far) [sender]

rsync error: error in rsync protocol data stream (code 12) at io.c(601) [sender=3.0.7]

mover finished
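
Since the log is too long to post in full, a tally of the error numbers at the end of each rsync line gives a quick summary. A minimal sketch: `errno_tally` is a made-up name, and it only counts lines that end in a number in parentheses, like the (103) and (107) lines above. On Linux, 103 is ECONNABORTED and 107 is ENOTCONN.

```shell
# errno_tally [FILE...]
# For every line ending in "(NNN)", count how often each NNN occurs
# and print "NNN count" pairs sorted by error number.
# Reads stdin when no FILE is given.
# (errno_tally is a hypothetical helper name.)
errno_tally() {
    awk 'match($0, /\([0-9]+\)$/) {
             # Strip the surrounding parentheses from the match.
             n[substr($0, RSTART + 1, RLENGTH - 2)]++
         }
         END { for (e in n) print e, n[e] }' "$@" | sort -n
}

# Example use (the path is an assumption):
#   errno_tally /boot/mover-log.txt
```

Summary lines such as "rsync error: ... (code 23) at main.c(1042) [sender=3.0.7]" do not end in a parenthesised number, so they are left out of the tally.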

Link to comment

Executed the mover AGAIN (I removed all the AFP crap, thinking it might be causing issues, but alas):

 

mover started

skipping TimeMachine/

skipping data/

moving newsgroups/

moving videos/

./videos/movies/movie.mkv

.d..t...... ./

rsync: get_xattr_names: llistxattr("videos",1024) failed: Software caused connection abort (103)

.d..t.....x videos/

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)

cd+++++++++ videos/movies/

rsync: recv_generator: mkdir "/mnt/user0/videos/movies" failed: Transport endpoint is not connected (107)

*** Skipping any contents from this failed directory ***

rsync: get_acl: sys_acl_get_file(videos, ACL_TYPE_ACCESS): Transport endpoint is not connected (107)

rsync: get_xattr_names: llistxattr("videos",1024) failed: Transport endpoint is not connected (107)

rsync: failed to set times on "/mnt/user0/videos": Transport endpoint is not connected (107)

rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1042) [sender=3.0.7]

find: `fuser' terminated by signal 9

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: Call Trace:

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: Oops: 0000 [#1] SMP

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: Process fuser (pid: 8949, ti=e9e0e000 task=db56f020 task.ti=e9e0e000)

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: Code: c7 0f 84 ad 00 00 00 8d 40 40 66 c7 03 00 a0 e8 f4 b7 25 00 8b 47 04 3b 30 0f 83 8b 00 00 00 c1 e6 02 03 70 04 8b 06 85 c0 74 7f <f6> 40 24 01 74 05 66 81 0b 40 01 f6 40 24 02 74 05 66 81 0b c0

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: CR2: 0000000074617489

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: EIP: [<c10b3252>] proc_fd_instantiate+0x65/0xf7 SS:ESP 0068:e9e0feb4

 

Message from syslogd@stewie at Sun Jun  3 21:46:55 2012 ...

stewie kernel: Stack:

Link to comment
