• [6.9.0-Beta1] Locks up about once per day


    loftyDan
    • Minor

    I never had any hard lockup problems on 6.8.2 or 6.8.3.

     

    Twice in 2 days my system locked up completely.  Today, when I looked at the screen, it was full of "rcu_sched self-detected stall on CPU" and dump_stack messages.  Both times the system wasn't under load.  It's a fairly basic Plex server, and it wasn't being used at the time.

     

    System is:

     

    M/B: ASRock B450 Gaming-ITX/ac Version

    BIOS: Version P3.70. Dated: 11/18/2019

    CPU: AMD Ryzen 5 2600X

    RAM: 16 GiB DDR4

     

    Here is the message that repeats over and over.  After the first crash I enabled persistent logging, so I'm attaching the full syslog in case it helps, along with a diagnostics file I created after reverting the OS to 6.8.2 (I didn't realize it wasn't reverting to 6.8.3, or I'd have done that).

     

    Mar 11 20:00:16 Tower kernel: RSP: 002b:0000149d98b56b08 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
    Mar 11 20:00:16 Tower kernel: RAX: 0000000000000000 RBX: 0000149d98b56b40 RCX: 0000149d9949c727
    Mar 11 20:00:16 Tower kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000008
    Mar 11 20:00:16 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
    Mar 11 20:00:16 Tower kernel: R10: 0000000000000003 R11: 0000000000000202 R12: 0000000000404a74
    Mar 11 20:00:16 Tower kernel: R13: 0000149d98b56b40 R14: ffffffffffffff78 R15: 0000149d880095d0
    Mar 11 20:03:16 Tower kernel: rcu: INFO: rcu_sched self-detected stall on CPU
    Mar 11 20:03:16 Tower kernel: rcu:     2-....: (5818729 ticks this GP) idle=1d6/1/0x4000000000000002 softirq=20367780/20367780 fqs=1447357 
    Mar 11 20:03:16 Tower kernel:     (t=5820105 jiffies g=14752009 q=557393)
    Mar 11 20:03:16 Tower kernel: NMI backtrace for cpu 2
    Mar 11 20:03:16 Tower kernel: CPU: 2 PID: 11502 Comm: emhttpd Tainted: G      D           5.5.8-Unraid #1
    Mar 11 20:03:16 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Gaming-ITX/ac, BIOS P3.70 11/18/2019
    Mar 11 20:03:16 Tower kernel: Call Trace:
    Mar 11 20:03:16 Tower kernel: <IRQ>
    Mar 11 20:03:16 Tower kernel: dump_stack+0x64/0x7c
    Mar 11 20:03:16 Tower kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
    Mar 11 20:03:16 Tower kernel: nmi_cpu_backtrace+0x73/0x85
    Mar 11 20:03:16 Tower kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3
    Mar 11 20:03:16 Tower kernel: rcu_dump_cpu_stacks+0x89/0xb0
    Mar 11 20:03:16 Tower kernel: rcu_sched_clock_irq+0x1e4/0x513
    Mar 11 20:03:16 Tower kernel: update_process_times+0x1f/0x3d
    Mar 11 20:03:16 Tower kernel: tick_sched_timer+0x33/0x62
    Mar 11 20:03:16 Tower kernel: __hrtimer_run_queues+0xb7/0x10b
    Mar 11 20:03:16 Tower kernel: ? tick_sched_do_timer+0x39/0x39
    Mar 11 20:03:16 Tower kernel: hrtimer_interrupt+0x8d/0x160
    Mar 11 20:03:16 Tower kernel: smp_apic_timer_interrupt+0x6a/0x7a
    Mar 11 20:03:16 Tower kernel: apic_timer_interrupt+0xf/0x20
    Mar 11 20:03:16 Tower kernel: </IRQ>
    Mar 11 20:03:16 Tower kernel: RIP: 0010:native_queued_spin_lock_slowpath+0xa1/0x1f2
    Mar 11 20:03:16 Tower kernel: Code: c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 89 44 24 04 74 0c 0f ba e0 08 72 1e c6 47 01 00 eb 18 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 e9 2b 01 00 00 48 c7 c0 00 3c 02 00 65 48 03

    syslog.crash.march11.log tower-diagnostics-20200311-2045.zip
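
    For anyone triaging a syslog like the one attached, here is a minimal Python sketch that pulls out each "self-detected stall" report along with the CPU, PID, and process name that follow it. It is not an Unraid tool. The function name find_rcu_stalls and the regular expressions are my own assumptions, based only on the line format in the excerpt above. It also leaves the jiffies count as a raw number, since the kernel's tick rate isn't stated in the log.

```python
import re

# Patterns are assumptions derived from the log excerpt in this report.
TS_RE = re.compile(r"^\s*(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)")
STALL_RE = re.compile(r"rcu: INFO: (\w+) self-detected stall on CPU")
DETAIL_RE = re.compile(r"\(t=(\d+) jiffies g=(\d+) q=(\d+)\)")
CPU_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")


def find_rcu_stalls(lines):
    """Return one dict per 'self-detected stall' report found in the lines."""
    stalls = []
    current = None
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            # A new stall report starts here; later lines add detail to it.
            ts = TS_RE.match(line)
            current = {
                "timestamp": ts.group(1) if ts else None,
                "flavor": m.group(1),
            }
            stalls.append(current)
            continue
        if current is None:
            continue
        m = DETAIL_RE.search(line)
        if m and "jiffies" not in current:
            current["jiffies"] = int(m.group(1))
            continue
        m = CPU_RE.search(line)
        if m and "cpu" not in current:
            current["cpu"] = int(m.group(1))
            current["pid"] = int(m.group(2))
            current["comm"] = m.group(3)
    return stalls


# Three lines copied from the excerpt above, used as sample input.
SAMPLE = """\
Mar 11 20:03:16 Tower kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Mar 11 20:03:16 Tower kernel:     (t=5820105 jiffies g=14752009 q=557393)
Mar 11 20:03:16 Tower kernel: CPU: 2 PID: 11502 Comm: emhttpd Tainted: G      D           5.5.8-Unraid #1
"""

if __name__ == "__main__":
    for stall in find_rcu_stalls(SAMPLE.splitlines()):
        print(stall)
```

    To run it against a whole file, pass the open file object instead of the sample, e.g. find_rcu_stalls(open("syslog.crash.march11.log")). Counting how many reports come back, and which process names recur, is a quick way to tell whether the same task is stuck every time.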






  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.