HELP - Cannot access GUI - CACHE M2 drive is likely dead - NOW what?


emod


I cannot access GUI - CACHE M2 drive is likely dead, USB might be as well - NOW what? (specific questions are further below)

 

I think something really bad went wrong on my UNRAID server this time. I need help figuring out where the problem is, and I would appreciate any pointers.

 

UNRAID basic config: 1 Parity drive, 2 HDDs, 1 M.2 SSD (cache), on latest development branch: 6.10.0-rc2

-CACHE drive contains apps, and some other data....but all personal data was on arrays and parity.

I have a backup of USB BOOT, but not CACHE.

 

Here is what happened (I'll give all info to the best of my knowledge):

  1. While copying 25 GB of data (Resilio Sync and Plex indexing) on my UNRAID box, the CACHE drive heated up to over 67C for about 5 minutes, and CPU temps went to 100C for the same amount of time.
    1. I used an "Edust for electronics" duster to try to move air, cool down the motherboard and remove some excess dust.
  2. I was unable to access Docker to shut down the offending processes (Resilio sync and Plex)
  3. I was unable to reboot from within GUI
  4. I then turned off the server (directly on the box)
  5. I restarted the server but cannot access GUI (browser said "Problem loading page")
  6. I could not PING server from my PC.
  7. On my router I saw that Ethernet link to UNRAID is UP.
  8. I thought the USB might be dead, so I created a new UNRAID USB boot drive (from the UNRAID website) and swapped it for the old one -> restarted UNRAID -> still cannot access GUI.
  9. I connected a mini-monitor, keyboard and mouse directly to the server, then restarted the server.
  10. On the monitor I saw an error saying "CPU FAN failure, press F1". I pressed F1, to no avail.
  11. I opened the server case and noticed that the CPU fan was not moving at all. I touched it a bit, and it started working (there was some dust around the propeller, which might have prevented normal function).
  12. I rebooted the server, and now the server enters BIOS mode (by itself). When I exit BIOS, the server restarts on its own and re-enters the BIOS.
  13. I enter BIOS and notice that:
    1. Parity drive and 2 array drives are identified
    2. M.2 slot (where SSD cache is): says N/A
    3. CPU is running, CPU fan is running
    4. Memory sticks are identified.
    5. All other Hardware readings, incl. temperatures for CPU and MoBo seem normal.
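
Steps 5 to 7 above (page will not load, no PING reply, but the link is UP) amount to a reachability check. As a rough sketch, the same check can be run from a PC with Python's standard library. It assumes the web GUI listens on TCP port 80, which may differ if the HTTP port was changed in the server settings; the function name `port_open` is made up for this example.

```python
import socket


def port_open(host, port=80, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the web GUI answers on port 80. A False result only means
    nothing accepted the connection; it does not say which component failed.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_open("192.168.1.10")` would test the GUI port, where `192.168.1.10` is a placeholder for the server's real address. If this fails while the router still shows the link as UP, the machine most likely never finished booting, which matches the BIOS loop described in step 12.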

 

MY ASSUMPTIONS:

  • M.2 SSD CACHE drive is dead
  • Bootable USB might be dead too -> I assume that only the bootable USB is required to bring the system up (even without CACHE)...but when I inserted a brand new USB with fresh UNRAID, the system still wasn't accessible via GUI or via monitor.
  • The fact that the server goes straight into BIOS mode when turned on, and restarts into BIOS again when I exit, tells me something is wrong, but I cannot figure out whether it's the USB, the motherboard, or the SSD CACHE.

 

QUESTIONS:

  1. I assume CACHE is dead. How do I verify that?
    1. For example, can UNRAID GUI start with USB only (without CACHE)
    2. Is it possible that even some components on motherboard are dead?
  2. If CACHE is dead, and once I buy a new CACHE, what is the process for installing it (I assume I'll have to reinstall all apps, since nothing from CACHE was backed up on array)
    1. Do I just switch the old for new, and restart the server?
  3. Is there some hardware or tool I can buy to try to recover data from CACHE?

 

 

Thanks.

Edited by emod

1. A cache drive is not required to boot into the Unraid GUI.  It sounds like you may have overheated the M.2 drive, possibly causing it to fail.  It is also possible that your board is triggering thermal protection or will not boot due to a bad fan. It is unlikely that the USB failed as well.

 

2. If the cache is dead and you do not have a backup of the appdata folder, you will need to install any dockers and their respective configs from scratch. As long as your cache drive is not encrypted (and provided it is not bad), the data would be accessible. How you access it would depend on its file system. I would try removing the M.2 from the system and see if you can get things to boot.
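
Once the server (or any Linux machine with the drive installed) boots, one way to verify whether the M.2 drive is detected is to look at the block devices the kernel exposes under `/sys/block`. The sketch below assumes the M.2 drive is an NVMe device, which Linux names like `nvme0n1`; a SATA M.2 drive would show up as `sdX` instead and would not match this filter. The function names are made up for this example.

```python
import os


def list_block_devices(sys_block="/sys/block"):
    """Return the kernel's block device names, or [] if the directory is missing."""
    try:
        return sorted(os.listdir(sys_block))
    except FileNotFoundError:
        return []


def nvme_devices(sys_block="/sys/block"):
    """Return only the device names that look like NVMe drives.

    Assumption: the M.2 cache drive is NVMe. A SATA M.2 drive appears
    as sdX and has to be identified some other way.
    """
    return [name for name in list_block_devices(sys_block)
            if name.startswith("nvme")]
```

An empty list from `nvme_devices()` on a booted system would be consistent with the BIOS showing N/A for the M.2 slot. A non-empty list means the drive is at least being detected, so its data may still be readable.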

Edited by xxnumbxx
spelling

I think you are correct. The mobo was getting hot, and that might have triggered some safety mechanism. I left the server powered OFF overnight. I turned it back on in the morning, and literally everything was working fine. Even the mover moved everything (including the Apps backup) from CACHE to the array. Scary experience but a good lesson.

 

I have had issues with MOVER not working since Sept 2021, and after upgrading UNRAID OS to the next development branch, it suddenly started to work again (that was a pleasant surprise).

Edited by emod