• [6.9.x - 6.11.x] intel i915 module causing system hangs with no report in syslog (not alder lake)


    Tristankin
    • Minor

    Since the 5.x kernel-based releases, many users have been reporting system hangs every few days once the i915 module is loaded.

    With reports from a few users detailed in the thread below, we have worked out that the issue is caused by the i915 module and persists in both the 6.9.x release and the 6.10 release candidates.


    The system does not need to be actively transcoding for the hang to occur. 6.8.3 does not have this issue, so the problem is not hardware related. Unloading the i915 module stops the hangs. Hangs are still present in 6.10.0RC2. I can provide a list of similar reports if required.
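
    Since the hangs only occur while i915 is loaded, it helps to confirm the module state before and after a test. The following is a minimal sketch, assuming a Linux host where `/proc/modules` lists one loaded module per line with the module name as the first field. The function names `loaded_modules` and `is_module_loaded` are made up for illustration.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in text formatted like /proc/modules."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def is_module_loaded(name: str, proc_modules_text: str) -> bool:
    """True if the named module appears in the given /proc/modules text."""
    return name in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    modules_file = Path("/proc/modules")  # only present on Linux
    if modules_file.exists():
        loaded = is_module_loaded("i915", modules_file.read_text())
        print("i915 is loaded" if loaded else "i915 is not loaded")
```

    Running it right after boot and again after unloading the module shows whether a given test run actually had i915 active.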




    User Feedback

    Recommended Comments



    2 minutes ago, muzo178 said:

     

    did you try turning off hdr tone mapping? if you are using plex that is… with that off, it works for me.

     

    Just disabled and it works, but now washed out colours. So must be related to HDR tone mapping.

    Link to comment
    16 minutes ago, squiddles88 said:

    Just disabled and it works, but now washed out colours. So must be related to HDR tone mapping.

    Yeah.. used to work with 6.8.3. If we can solve this, we’re golden…

    Link to comment

    Well, I was good for over a month, and then the system started freezing again every 12 hours or so. I have migrated from binhex-plex to linuxserver/plex. I have also turned off HDR tone mapping. Fingers crossed I can return to stability again.

    Link to comment
    6 hours ago, Tristankin said:

    I have migrated from binhex-plex to linuxserver/plex.

    Why not use the official container instead of using third party ones?

    Link to comment
    22 hours ago, ich777 said:

    Why not use the official container instead of using third party ones?

    I tried to change the repository to keep config on the same container. Official would not boot and linuxserver required me to set up from scratch anyway. Lost all my play history but meh.

    Hasn't crashed yet though.

    Link to comment
    3 hours ago, Tristankin said:

    I tried to change the repository to keep config on the same container. Official would not boot and linuxserver required me to set up from scratch anyway.

    Interesting, this is the first time I've heard of that; they are usually all intercompatible.

    Link to comment
    2 hours ago, ich777 said:

    Interesting, this is the first time I've heard of that; they are usually all intercompatible.


    I would have hoped so. Folder structure in appdata is different.

    Link to comment
    2 hours ago, Tristankin said:


    I would have hoped so. Folder structure in appdata is different.


    You should be able to copy the key data folders over following the guides they have: https://support.plex.tv/articles/201370363-move-an-install-to-another-system/
     

    Odd that it's giving you issues changing containers. I'd try stopping the old container, move the folder or target a new folder with your new container. Start the new one, confirm it boots, then shut it down and copy over the relevant data.

    It might be worth purging plex cache as well. Then see if it starts.

    Link to comment
    1 hour ago, flyize said:

    Makes me wonder if there was somehow something wrong with your Plex install.


    linuxserver puts the pms install in the following location

    image.thumb.png.2d0f9302772b36b990550b6d38d293d4.png

    Where binhex was in the top level directory

    image.thumb.png.95e0429b2369de04bb1133605ffa3842.png

     

    Is what it is, not the end of the world. 100% less annoying than a server that needs restarting twice a day.

    Edited by Tristankin
    Link to comment
    12 minutes ago, Tristankin said:

    linuxserver puts the pms install in the following location

    But it looks to me like you only have to set up your mappings in the Docker template differently, and then it should work, because they all use Plex as the base app.

    I only assumed that it is working because some people reported that they successfully switched from the Linuxserver or Binhex container over to the official one.

    Link to comment

    I mean essentially starting fresh. If it works now, then maybe something was wrong with your installation. That might explain why others weren't really seeing the same issue.

    Link to comment
    3 minutes ago, flyize said:

    I mean essentially starting fresh. If it works now, then maybe something was wrong with your installation. That might explain why others weren't really seeing the same issue.


    Could also be turning off HDR tone mapping, vt-d, mounting directly to the cache drive, or a combination of factors. 

    This install ran on 6.8.3 for a year and a half with 0 issues and 0 crashes; it fails on anything with a 5.x kernel. I am pretty sure the install was fine, but I'm continuing to try.
     

    Would be interesting if it is just the tone mapping, but I can't be 100% sure till I know I have a stable system again and can start testing those features.

    Link to comment
    6 hours ago, Tristankin said:


    I would have hoped so. Folder structure in appdata is different.

    The way I migrated from linuxserver to official plex was to first back up my appdata, then edit my linuxserver Plex docker image. 

     

    Under repository, I changed the value to plexinc/pms-docker:latest, then hit done. It redownloaded everything from plex official, but all the settings and paths I set up under linuxserver remained the same, and Plex started with no issues. 

     

    However, I'm still crashing repeatedly under Plex transcoding. I've tried disabling HDR tone mapping, removing GPU Top and statistics, changing appdata to cache, and touching the i915 file. I just ordered dummy plugs (I have a TV plugged into my server, Qnap TS-453Be, Intel J3455), but only 1 of the 2 HDMI ports is plugged in, so I'll try plugging in the other one. If that fails, I'll try disabling vt-x, and if all else fails I'll install 6.8.3.

     

    Since this is a fresh install of Unraid, can I just download 6.8.3 and copy it over to the flash drive? Or are there other steps I need to take? The main reason for this box is Plex and file serving, and after a month and a half of troubleshooting, I'm about to give up.

     

    Any thoughts? I posted a standalone thread previously, and that just withered and died without any responses. I'd appreciate any help. I've included diagnostics and logs, though the logs just fall off a cliff before each reboot without any indicators.
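
    When a log "falls off a cliff", the last line written before the silence at least pins down when the hang happened. The following is a rough sketch that scans a syslog for long gaps between consecutive timestamps. It assumes the classic `Mon DD HH:MM:SS` prefix with no year, so the year has to be supplied; `parse_stamp`, `find_gaps`, the 30-minute default threshold and the sample lines (including the hostname `tower`) are all invented for illustration.

```python
from datetime import datetime, timedelta


def parse_stamp(line: str, year: int):
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if the line has none."""
    try:
        # The stamp is the first 15 characters; the year is prepended because
        # this log format does not record it.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, year, threshold=timedelta(minutes=30)):
    """Return (last_line_before, first_line_after, gap) for each long silence."""
    gaps = []
    prev_time, prev_line = None, None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip lines without a recognisable timestamp
        if prev_time is not None and stamp - prev_time > threshold:
            gaps.append((prev_line, line, stamp - prev_time))
        prev_time, prev_line = stamp, line
    return gaps


if __name__ == "__main__":
    # Invented sample lines, not taken from the attached syslog.
    sample = [
        "Mar 14 11:58:01 tower kernel: example message",
        "Mar 14 11:59:12 tower kernel: last message before the hang",
        "Mar 14 12:19:40 tower kernel: first message after the restart",
    ]
    for before, after, gap in find_gaps(sample, 2023, timedelta(minutes=10)):
        print(f"silent for {gap}: last line was {before!r}")
```

    A quiet server can legitimately log nothing for a while, so a gap is only a hint; it is most useful when compared against the times the box was power-cycled.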

     

     

    Screenshot 2023-03-14 11.59.12 AM.png

    babybertha-diagnostics-20230314-1219.zip syslog (2)

    Edited by NullZeroNobody
    Added diagnostics
    Link to comment
    On 3/15/2023 at 3:15 AM, NullZeroNobody said:

    I just ordered dummy plugs (I have a TV plugged into my server

     

    Is the tv always switched on? The dummy plug could very well fix your issues.

    Link to comment

    I just upgraded my motherboard and seem to be having this same issue. I have been reading that the only potential solution is to downgrade to 6.8.* but I am not sure how to do that while preserving my current configuration. Is the Unraid team working on this or are we being ignored? This seems to be a critical issue for a ton of folks. 

     

    • Is there a fix?
    • If there is no fix, is there a way to downgrade whilst keeping existing array configuration/data?

     

    My infrastructure is down and I cannot access any of my servers or data on that machine; complete productivity halt. I am not sure what to do, but this seems like a big oversight on Unraid's part, and I would like to know the ETA for a fix. If there is no ETA or development thread we can follow, then I guess Unraid is not a viable option for anyone who has an Intel board requiring the i915 module (which appears to be a lot of folks).

     

    Is this being worked on and where can I see the progress? 

    Link to comment
    On 3/15/2023 at 8:02 PM, Tristankin said:

     

    Is the tv always switched on? The dummy plug could very well fix your issues.

    No, the TV isn't always switched on, but CEC is turned on in the TV, so I believe there is still signal sent. However, I swapped out the HDMI cable for dummy plugs in both HDMI ports, still reboots.

     

    • Unpinned CPU-0 for both VM and Plex, still reboots.

     

    • Removed hardware decoding from Plex completely, (---/dev/dri), still reboots.

     

    At this point, I ordered new RAM and a new power brick. The system pulls 50-60 watts at full tilt, and the power brick is rated for 95W, but it is also 5 years old and had been on constantly prior to installing Unraid. 

     

    Lately, it's been rebooting pretty much every time I use plex, either right away, or after an episode or two. I've had it reboot on me 4 times in one day!

     

    Curious note: a couple of days ago, instead of freezing as it normally does, requiring me to shut it off remotely with a smart plug and start it back up, the server rebooted itself, going back into Unraid. It was still an unclean reboot, since it started a parity check, but that was new. It happened twice, then went back to the usual freezing and requiring a manual restart.

    syslog

    Link to comment
    3 hours ago, EntropyInjection said:

    Is the Unraid team working on this or are we being ignored? This seems to be a critical issue for a ton of folks. 

    Can you please provide your Diagnostics?

    You haven't even told us what motherboard you've upgraded to.

     

    3 hours ago, EntropyInjection said:

    My infrastructure is down and I cannot access any of my servers or data on that machine; complete productivity halt.

    Why? How do you access your server? Many users, including me, are using an Unraid server with an Intel GPU without any issues.

    Link to comment

    Hi, thanks for the response. I can't access the server or data because it will not boot. It's fine; I can't wait for a solution, so I purchased different hardware and will transfer the drives over. It appears to be something specific to certain Intel GPUs during the loading of the i915 module. It seems to be a known issue (it was just unknown to me). There is a plethora of inquiries on this topic, a great many related to the Unraid 6.11 upgrade. 

     

    It was just frustrating to be blocked by something like that. Apologies for the panic and frustration. I am hoping the new hardware will solve the issue. 

    Link to comment
    39 minutes ago, EntropyInjection said:

    I can't access the server or data because it will not boot.

    No, wait, that's not what this issue is for...

    There must be something else going on with your system. Can you provide a photo of the server while it is booting?

     

    40 minutes ago, EntropyInjection said:

    It seems to be a known issue (it was just unknown to me).

    No, the issue that you describe is completely different and is not related to this issue; you can even prevent the module from loading.

    Link to comment
    7 hours ago, NullZeroNobody said:

    Curious note: a couple of days ago, instead of freezing as it normally does, requiring me to shut it off remotely with a smart plug and start it back up, the server rebooted itself, going back into Unraid. It was still an unclean reboot, since it started a parity check, but that was new. It happened twice, then went back to the usual freezing and requiring a manual restart.


    That is a weird one. My system will always respond to the long power button push to force a power off. 

    I am up to 9 days of uptime now since I reinstalled Plex fresh and turned off HDR tone mapping. This has happened in the past: last time, after changing the mount point and turning off vt-d, it took about a month before it started freezing again. I am going to wait for 2 months and then try turning features back on to see if I can home in on the issue with my particular system.

    Link to comment
    19 hours ago, NullZeroNobody said:

    requiring me to shut it off remotely with a smart plug 

    syslog

    Hmmm, that's actually a great idea for when I'm traveling as a 'just in case' thing. Thanks!

    Link to comment
    22 hours ago, ich777 said:

    No, wait, that's not what this issue is for...

    There must be something else going on with your system. Can you provide a photo of the server while it is booting?

     

    No, the issue that you describe is completely different and is not related to this issue; you can even prevent the module from loading.

     

    I see, thanks for letting me know. I would provide a screenshot but I already swapped out the hardware.

     

    I am not sure how to disable these modules, but I have another motherboard I was going to use to upgrade my other server machine. If I can simply prevent that module from loading, then I should be able to upgrade the other server. Would I just look into disabling that module for Linux and manually configure the OS itself so it does not load? If not, could you point me in the right direction?
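
    On Linux in general, a module can be kept from loading automatically with a `blacklist <module>` line in a modprobe configuration file. The following is a hedged sketch of writing such a file. The helper name `write_blacklist` is made up, and the directory is passed in rather than hard-coded: on Unraid the persistent location is usually described as `/boot/config/modprobe.d` on the flash drive, but treat that path as an assumption and check it against the release notes for your version.

```python
import tempfile
from pathlib import Path


def write_blacklist(module: str, conf_dir: Path) -> Path:
    """Write '<module>.conf' containing a modprobe blacklist line; return its path."""
    conf_dir.mkdir(parents=True, exist_ok=True)
    conf_file = conf_dir / f"{module}.conf"
    # 'blacklist' stops the module from being auto-loaded via its aliases;
    # an explicit 'modprobe <module>' still loads it.
    conf_file.write_text(f"blacklist {module}\n")
    return conf_file


if __name__ == "__main__":
    # Demo against a throwaway directory, so nothing on the real system changes.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_blacklist("i915", Path(tmp) / "modprobe.d")
        print(path.name, "->", path.read_text().strip())
```

    The file only takes effect the next time modules are loaded, so a reboot is needed before checking whether i915 stayed out.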

     

    Thanks again

    Link to comment

    Not sure I am out of the woods yet; I got another freeze after 15 days of uptime.

    I have just turned on syslog to flash so I will see if there is anything recorded but I doubt it.

    Edited by Tristankin
    Link to comment




  • Status Definitions

    Open = Under consideration.
    Solved = The issue has been resolved.
    Solved version = The issue has been resolved in the indicated release version.
    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.
    Retest = Please retest in latest release.

    Priority Definitions

    Minor = Something not working correctly.
    Urgent = Server crash, data loss, or other showstopper.
    Annoyance = Doesn't affect functionality but should be fixed.
    Other = Announcement or other non-issue.