• 6.10.3 Samsung 980 Temp Warning


    uefcommand
    • Minor

    I upgraded from 6.9.3 to 6.10.3 last night and am now getting temperature warnings on my Samsung 980 cache drive saying it's at 84C. The reading jumps from 44C to 84C within 10 seconds and then back again. I changed the polling interval to 10 seconds to watch the behavior. I believe this is a bug; I have also found many posts on this forum and on Reddit about it. Any guidance would be appreciated. Thank you.

    diagnostics-20220622-0902.zip

    • Like 1
    • Upvote 1



    User Feedback

    Recommended Comments

    6 minutes ago, uefcommand said:

    How is it a device issue when it only started to occur with 6.10.3?

    Check the link: it only happens with newer kernels. It's still a device problem, or at best a kernel problem with that device, since it happens with different Linux distros.

    Link to comment

    I updated the device to the newest firmware, 2B4QFXO7, and still saw the same behavior: within 10 seconds it went back down to 40C. This does not seem possible.

    Link to comment
    On 7/21/2022 at 11:42 AM, FreakyUnraid said:

    Where and how do I use this command in Unraid? 

    Not sure that will help, but it won't hurt, do this:

     

    On the main GUI page, click on flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (on the top right), and add this to your default boot option, after "append initrd=/bzroot"

    e.g.:

    append initrd=/bzroot nvme_core.default_ps_max_latency_us=0


    Reboot and see if it makes a difference.
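After the reboot, one way to confirm the option was actually picked up is to look at the kernel command line. A minimal sketch, assuming a Python 3 interpreter is available on the server (this may need a plugin on Unraid) and that the usual Linux paths `/proc/cmdline` and `/sys/module/nvme_core/parameters/default_ps_max_latency_us` are present; the helper names are made up for illustration:

```python
from pathlib import Path
from typing import Optional

PARAM = "nvme_core.default_ps_max_latency_us"


def kernel_param_value(cmdline: str, name: str) -> Optional[str]:
    """Return the value of a name=value token on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


def report() -> None:
    """Print what the running kernel was booted with (hypothetical helper)."""
    cmdline_path = Path("/proc/cmdline")
    if cmdline_path.exists():
        value = kernel_param_value(cmdline_path.read_text(), PARAM)
        print("boot parameter:", value if value is not None else "not set")
    # Assumed sysfs location of the value the nvme_core module is using.
    sysfs = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")
    if sysfs.exists():
        print("driver value:", sysfs.read_text().strip())
```

Calling `report()` on the server should show `0` in both places if the append line was edited as described.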

     

    Link to comment
    On 7/21/2022 at 11:42 AM, FreakyUnraid said:

    I found the following post that claims to 'fix' this issue

    Did it work?

    Link to comment
    On 8/1/2022 at 2:34 PM, JorgeB said:

    Did it work?

    I can confirm this solved the same error I had with the Samsung 980 500GB NVMe M.2 drive.

     

    Thank you!

    • Like 1
    Link to comment
    On 7/24/2022 at 2:02 PM, JorgeB said:

    Not sure that will help, but it won't hurt, do this:

     

    On the main GUI page, click on flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (on the top right), and add this to your default boot option, after "append initrd=/bzroot"

    e.g.:

    append initrd=/bzroot nvme_core.default_ps_max_latency_us=0


    Reboot and see if it makes a difference.

     

    I have a 980 SSD and have been experiencing the exact same issue with 6.11.1. I was getting crazy temperature spike reports showing the cache at double its normal operating temperature, 183F, which immediately struck me as unlikely. I physically inspected the drive the first time it happened and it was fine; there's no way it was or had been anywhere close to 180F. I also noticed that if I spun down the array, the reported temperature would immediately drop back to the typical range (93F), which also seems unlikely. I just applied the above, thank you for the information! Hoping it resolves the issue...

     

    Also, I only saw this issue intermittently: it happened only 2-3 times over the last week and a half. I upgraded from 6.8.x about a week ago (I had been skipping upgrades pending some new hardware, including this SSD lol).
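For comparison with the Celsius figures elsewhere in this thread, the Fahrenheit readings above convert as follows. This is just the standard conversion formula, written as a small Python sketch:

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Standard Fahrenheit to Celsius conversion."""
    return (deg_f - 32.0) * 5.0 / 9.0


# The spike and the "typical" reading quoted in the post above.
for deg_f in (183, 93):
    print(f"{deg_f}F -> {fahrenheit_to_celsius(deg_f):.1f}C")
# 183F -> 83.9C
# 93F -> 33.9C
```

So the 183F spike is the same roughly 84C value other posters report.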

    • Like 1
    Link to comment

    I noticed that I now have this problem after jumping from 6.8.3 to 6.11.3. Mine will go from 110F to 122F 3 or 4 times a day, and as far as I can tell the server isn't doing anything at the time. I have Plex, Sonarr, Radarr and Home Assistant VM running. I thought Tdarr was causing it, but I only have Tdarr run for 3 hours each night, from midnight to 3am. Any ideas what else I could do?

     

    Running  2 in a pool for cache
    Samsung SSD 970 EVO Plus 2TB

    on an Asus TUF GAMING X570-PLUS motherboard

    AMD Ryzen 5 3600 6-Core @ 3600 MHz

    Link to comment
    On 7/21/2022 at 12:42 PM, FreakyUnraid said:

    @JorgeB I found the following post that claims to 'fix' this issue https://forum.proxmox.com/threads/smart-error-health-detected-on-host.109580/#post-475308

     

    Where and how do I use this command in Unraid? 

    Hi Jorge,

     

    I've been facing these temperature issues since I got a 980 for the main cache pool. My Syslinux configuration looks different from what you describe in your post, so I'm not quite sure where to add that line. Here is a screenshot of it; could you please take a look?

     

    By the way, after checking the logs and the times of the temperature warnings, I found the trigger to be the Flash Backup process, and the temperature always went back down to 34ºC exactly 30 minutes after the warning. I was very worried and even added a thick heatsink to the drive, but nothing changed.

     

    Thx!

     

     

    image.png

    Edited by surferjsmc
    Link to comment
    18 minutes ago, surferjsmc said:

    here is a screenshot of it because i'm not quite sure where to add that line, could you please take a look?

    Add it to the end of the append line in the "Unraid OS" boot option (the one in green).
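To make "the end of the append line in the Unraid OS boot option" concrete, here is a rough Python sketch of the same edit done on the config text. The sample config is an assumption about what a typical Unraid `syslinux.cfg` looks like (yours may differ, as in the screenshot above), and `add_kernel_param` is a made-up helper, not an Unraid tool:

```python
PARAM = "nvme_core.default_ps_max_latency_us=0"

# Assumed, abbreviated example of a syslinux.cfg; not copied from a real server.
SAMPLE = """\
default menu.c32
label Unraid OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label Unraid OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
"""


def add_kernel_param(cfg_text: str, label: str, param: str) -> str:
    """Add param to the end of the append line under the given label, once."""
    out = []
    in_block = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("label "):
            # Entering a new boot option; remember whether it is the target.
            in_block = stripped[len("label "):].strip() == label
        elif in_block and stripped.startswith("append ") and param not in stripped.split():
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


print(add_kernel_param(SAMPLE, "Unraid OS", PARAM))
```

Only the "Unraid OS" entry is touched; the GUI Mode entry keeps its original append line, and running the function twice does not add the parameter a second time. Editing through the GUI as described above is still the simpler route.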

    • Thanks 1
    Link to comment

    No more temperature alerts in the last 15 hours, so I'm confident this fix works :D 

    Running Unraid 6.11.5, cache drive Samsung 980 500GB.

     

    Thank you all.

    • Like 1
    Link to comment
    On 7/24/2022 at 9:02 PM, JorgeB said:
    append initrd=/bzroot nvme_core.default_ps_max_latency_us=0

     

    It has been more than 20 hours since my 3 Samsung 980s stopped throwing hourly 84C warnings, so the fix works.
    Running Unraid 6.11.5

    3xSamsung 980 1tb NVMEs

    2xSamsung 980 PRO 1tb NVMEs

    • Like 1
    Link to comment

    I was having this issue too. Further along, the article linked above talks about updating the firmware of the drive to fix this as well. To me that seems like a better fix... But I'm not sure how I would go about updating the FW! It seems a little scary! 🙂

    Link to comment
    5 hours ago, jbuszkie said:

    I was having this issue too. Further along, the article linked above talks about updating the firmware of the drive to fix this as well. To me that seems like a better fix... But I'm not sure how I would go about updating the FW! It seems a little scary! 🙂


    Updating isn't difficult. You can do it from a bootable flash drive made with something like Ventoy, or run it from Unraid. There are shared instructions in a Reddit post: search /r/unraid for "Update Samsung SSD Firmware".

    • Upvote 1
    Link to comment

    I was able to update using the ISO image from the Samsung website.

    I got it from here. It scared the hell out of me, as it warned ~"This will or may erase all data".

    So I just did one drive at a time (I have two) and both were fine. No data was erased!

    Some folks on other websites claimed that the ISO didn't work and described how to do it manually.

    I didn't have to go that route, thankfully...

    • Like 1
    Link to comment

    Someone asked for the steps I used, so here they are.

     

    1. Download the ISO image from the Samsung website. (Link above - make sure you grab the correct image.)

    2. Make a bootable USB flash drive from the ISO. (I used Rufus, but I'm sure there are others.) I did this on Windows.

    3. If you feel paranoid, make sure you have everything on your cache drive backed up. I didn't lose data, but better to be safe than sorry.

    4. Shut down Unraid.

    5. Pull the Unraid flash drive and put in the Samsung ISO image drive you just created.

    6. Boot the Samsung image and follow the prompts to update the firmware. Some folks had issues with a USB keyboard in the past. I had no issues.

    7. Power down the Unraid machine again.

    8. Take out the Samsung flash drive and put the Unraid flash drive back in.

    9. Power on the Unraid machine and verify you have the updated firmware.
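For step 9, one possible way to check the firmware from a script is to read the "Firmware Version" line that `smartctl -i` prints for NVMe drives. A minimal sketch, assuming `smartctl` is installed and Python 3 is available; the device path `/dev/nvme0` and the helper names are placeholders, and the sample output is abbreviated and only illustrative:

```python
import subprocess
from typing import Optional

# Abbreviated, illustrative smartctl -i output; the firmware string is the
# one mentioned earlier in this thread.
SAMPLE_INFO = """\
=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 980 500GB
Firmware Version:                   2B4QFXO7
"""


def firmware_version(smartctl_info: str) -> Optional[str]:
    """Pull the firmware string out of smartctl -i text, or None if absent."""
    for line in smartctl_info.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Firmware Version":
            return value.strip()
    return None


def read_firmware(device: str) -> Optional[str]:
    """Run smartctl -i on a device (for example the placeholder /dev/nvme0)."""
    result = subprocess.run(
        ["smartctl", "-i", device], capture_output=True, text=True, check=False
    )
    return firmware_version(result.stdout)


print(firmware_version(SAMPLE_INFO))
```

On the server, `read_firmware("/dev/nvme0")` would be called with whatever device name the cache drive actually has. The drive's page in the Unraid GUI should show the same information without any scripting.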

    Edited by jbuszkie
    • Like 1
    • Upvote 2
    Link to comment



