• Unraid 6.11 Stable i915 GPU Hang


    ptbsare
    • Closed Urgent

    Steps to reproduce:

    ASRock J3455M motherboard

    1. Start the Plex Docker container, attaching the devices /dev/dri/card0 and /dev/dri/renderD128.

    2. Play a video with transcoding on.

    3. The GPU hangs.

    Logs:

    Sep 26 11:31:19 ptbsare-srv kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Sep 26 11:31:19 ptbsare-srv kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Sep 26 11:31:19 ptbsare-srv kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Sep 26 11:31:19 ptbsare-srv kernel: i915 0000:00:02.0: [drm] Resetting vcs0 for CS error
    Sep 26 11:31:19 ptbsare-srv kernel: i915 0000:00:02.0: [drm] Plex Transcoder[21585] context reset due to GPU hang
    Sep 26 11:31:19 ptbsare-srv kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:4:40000000, in Plex Transcoder [21585]
    Sep 26 11:31:22 ptbsare-srv kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
    Sep 26 11:31:22 ptbsare-srv kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:
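The log lines above follow a regular shape, so a small script can tally them. The following is a minimal Python sketch, not an official tool: the regular expressions are inferred only from the lines quoted in this report, and the name `summarize_i915_log` is made up for illustration. The part of the ecode after the second colon and the trailing process name are treated as optional because the last quoted line is cut off. The sketch does not try to interpret the ecode fields.

```python
import re
from collections import Counter

# Patterns inferred from the log lines quoted in this report (an assumption,
# not a documented format).
RESET_RE = re.compile(r"\[drm\] Resetting (?P<engine>\w+) for (?P<reason>.+)$")
HANG_RE = re.compile(
    r"\[drm\] GPU HANG: ecode (?P<ecode>[0-9a-f]+:[0-9a-f]+:[0-9a-f]*)"
    r"(?:, in (?P<proc>.+?) \[(?P<pid>\d+)\])?"
)

def summarize_i915_log(lines):
    """Count engine resets and GPU hangs found in kernel log lines."""
    resets = Counter()  # (engine, reason) -> count
    hangs = Counter()   # process name, or "unknown" if the line is cut off
    for line in lines:
        line = line.strip()
        m = RESET_RE.search(line)
        if m:
            resets[(m["engine"], m["reason"])] += 1
            continue
        m = HANG_RE.search(line)
        if m:
            hangs[m["proc"] or "unknown"] += 1
    return resets, hangs
```

Feeding it the output of `dmesg` or the syslog file line by line returns two counters: one keyed by engine and reset reason, one keyed by the process named in each hang line.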




    User Feedback

    Recommended Comments

    Sep 27 11:12:09 Xserve kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Sep 27 11:12:09 Xserve kernel: i915 0000:00:02.0: [drm] Plex Transcoder[28979] context reset due to GPU hang
    Sep 27 11:12:09 Xserve kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:6978ba13, in Plex Transcoder [28979]
    Sep 27 11:12:16 Xserve kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
    Sep 27 11:12:16 Xserve kernel: i915 0000:00:02.0: [drm] Plex Transcoder[28979] context reset due to GPU hang
    Sep 27 11:12:16 Xserve kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8fd8ffff, in Plex Transcoder [28979]
    Sep 27 11:12:23 Xserve kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
    Sep 27 11:12:23 Xserve kernel: i915 0000:00:02.0: [drm] Plex Transcoder[28979] context reset due to GPU hang
    Sep 27 11:12:23 Xserve kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8fd8ffff, in Plex Transcoder [28979]
    Sep 27 11:12:30 Xserve kernel: i915 0000:00:02.0: [drm] Resetting vcs0 for preemption time out
    Sep 27 11:12:30 Xserve kernel: i915 0000:00:02.0: [drm] Plex Transcoder[28979] context reset due to GPU hang
    Sep 27 11:12:30 Xserve kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:4:cab6fff5, in Plex Transcoder [28979]
    Sep 27 11:12:37 Xserve kernel: i915 0000:00:02.0: [drm] Resetting vecs0 for preemption time out
    Sep 27 11:12:37 Xserve kernel: i915 0000:00:02.0: [drm] Plex Transcoder[28979] context reset due to GPU hang
    Sep 27 11:12:37 Xserve kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:8:cffbffff, in Plex Transcoder [28979]
    Sep 27 11:14:10 Xserve kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Sep 27 11:14:10 Xserve kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Sep 27 11:14:17 Xserve kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out

     

    Got the same thing on my J3355 CPU.

    Edited by muzo178
    11 hours ago, limetech said:

    Bottom of this post says fixed in kernel 5.19.11

     

    True, maybe not the same issue then. @ich777, any idea, since you're the one who found those links?

    7 minutes ago, JorgeB said:

    True, maybe not the same issue then. @ich777, any idea, since you're the one who found those links?

    I will create a custom build with Kernel 5.19.12 (latest stable) later today and see if it fixes the issue on my Cherry Trail machine too.

    15 hours ago, ptbsare said:

    Unraid 6.11.1 is using 5.19.12 kernel.

    I just realized that earlier I misread this as "Unraid 6.11.0 is using 5.19.12 kernel". What a ding dong. Sorry for the extra work @ich777, we just needed to wait for 6.11.1.
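Since the question in this exchange is whether the running kernel is at or past the version cited as containing the fix (5.19.11), the comparison can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the `-Unraid` suffix in the example release string is an assumption about what `uname -r` prints on these builds.

```python
import re

# Kernel version that the linked post reports as containing the fix.
FIXED_IN = (5, 19, 11)

def parse_kernel_release(release):
    """Turn a release string such as '5.19.12-Unraid' into (5, 19, 12)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level (e.g. '6.1') is treated as 0.
    return tuple(int(part or 0) for part in m.groups())

def has_reported_fix(release):
    """True if the release is at or above the version the thread cites."""
    return parse_kernel_release(release) >= FIXED_IN
```

On a live system, `has_reported_fix(platform.release())` would apply the check to the running kernel. It only compares version numbers and cannot confirm that the hang is actually gone.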

    1 hour ago, JorgeB said:

    sorry for the extra work @ich777

    No worries.

     

    I was also curious whether it works, and recompiling the kernel or creating an Unraid build with another kernel version is not that big of a deal for me. 😅


    I'm not convinced this is closed. I've been having these hangs with an i9 9900 on version 6.11.5.

     

    Jan 28 21:20:00 Skippy2 kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
    Jan 28 21:20:00 Skippy2 kernel: i915 0000:00:02.0: [drm] frigate.detecto[2023] context reset due to GPU hang
    Jan 28 21:20:00 Skippy2 kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8ed1fff2, in frigate.detecto [2023]

     

    This is causing Frigate to stop working and restart. It happens every hour or so.
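To put a number on how often the hangs recur, the gaps between consecutive `GPU HANG` lines can be computed from the syslog timestamps. The sketch below is an assumption-laden illustration, not part of any Unraid tooling: `hang_intervals` is a hypothetical name, the timestamp pattern is inferred from the lines quoted in this thread, and because syslog timestamps carry no year, the caller has to supply one.

```python
import re
from datetime import datetime

# Timestamp layout inferred from the quoted logs, e.g. "Jan 28 21:20:00".
HANG_LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*GPU HANG: ecode"
)

def hang_intervals(lines, year):
    """Return the seconds between consecutive GPU HANG log lines.

    `year` is supplied by the caller because the log lines do not include it.
    """
    times = []
    for line in lines:
        m = HANG_LINE_RE.match(line.strip())
        if m:
            stamp = " ".join(m["stamp"].split())  # collapse padding like "Jan  8"
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Gaps clustered around 3600 seconds would match the "every hour or so" observation, while gaps of a few seconds would point to repeated resets within a single incident.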



  • Status Definitions

    Open = Under consideration.
    Solved = The issue has been resolved.
    Solved version = The issue has been resolved in the indicated release version.
    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.
    Retest = Please retest in latest release.

    Priority Definitions

    Minor = Something not working correctly.
    Urgent = Server crash, data loss, or other showstopper.
    Annoyance = Doesn't affect functionality but should be fixed.
    Other = Announcement or other non-issue.