Constant crashes


bamhm182


Hello,

 

I have been using Unraid for a long time on an R710 and a custom-built server. I recently sold the R710 and moved everything over to the custom-built server, and for some reason I have had random crashes ever since. The only thing I can think of is that the PSU isn't powerful enough. I have looked through the logs several times and cannot pinpoint the issue from there, but I'm hoping someone else can before I go dump a bunch of money into a new PSU. It is never doing anything insane when it crashes. I just have a couple of VMs and Docker containers that are always running in the background. The only one that ever really uses a ton of juice is Plex.

 

Thank you in advance to anyone willing to help me look into this!

tardis-diagnostics-20200901-0036.zip

10 hours ago, bamhm182 said:

The only thing I can think of is that the PSU isn't powerful enough. I have looked through the logs several times and I cannot seem to pinpoint what the issue is from there, but I'm hoping maybe someone else can before I go dump a bunch of money into a new PSU.

Please give the make and model number of that PSU. Do you know whether it has a single or dual +12V rail? You do have a pile of disks on that server, but a lot of them look like they might be SSDs. Give us a breakdown by type.

 

Have you run a 14-hour Memtest on the RAM? (It is an option in the boot menu; select it before the menu times out and boots Unraid by default.)


Sorry for the late response. I thought things were slowing down and I'd get a second to really dig into this problem. Boy was I wrong...

 

The power supply is a Corsair CS55M. The backplanes I have provide power over molex, so all my drives (aside from my m.2) are powered from the molex connectors on it. It appears to be a single-rail PSU. Just to see what my max-ish power consumption was, I started up a few hundred `yes` streams and made sure all of my disks were spun up. My UPS said I was pulling around 200W. At ~idle, I'm around 110W.

 

As far as disks go, I have the following:

 

7x 3.5" Spinning Disk (Molex Power)

1x 2.5" Spinning Disk (Molex Power)

2x NVMe (PCIe Power)

1x SATA M.2 (SATA Power)

3x 2.5" SSD (Molex Power)
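One thing the UPS reading does not capture is the brief spike when spinning disks start up, since the 200W and 110W numbers are steady-state. Below is a rough sketch of that estimate in Python. The 2.0 A per 3.5" drive figure is an assumption for illustration only, not something measured on this server; the real spin-up current is on each drive's datasheet, and the result should be compared with the PSU's +12V rating.

```python
def spinup_watts(drive_count: int, amps_per_drive: float, volts: float = 12.0) -> float:
    """Worst-case extra load if every drive spins up at the same moment."""
    return drive_count * amps_per_drive * volts


# ASSUMPTION: 2.0 A on +12V per 3.5" drive during spin-up (hypothetical value,
# not measured on this server; check the drive datasheets).
ASSUMED_SPINUP_AMPS = 2.0
SPINNING_3_5_INCH = 7  # count taken from the disk list above

surge = spinup_watts(SPINNING_3_5_INCH, ASSUMED_SPINUP_AMPS)
print(f"Estimated +12V spin-up surge for {SPINNING_3_5_INCH} drives: {surge:.0f} W")
```

This only covers the case where all drives spin up together, such as at power-on or when the whole array wakes at once.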

 

I haven't done a memtest yet and the server is usually in use. I'll try to remember to start it before bed tonight.

 

I've enabled logging to my USB, but I can never find any sort of crash information there either. I'll do it again and post some information from around the time of the crash. It just kind of instantly craps out, then works when I reboot it again, which makes me think it's something like the PSU going out rather than something to do with software. That said, I did run into an issue recently where it just REFUSED to boot. It was giving me exit_boot() and efi_main() failures after grub. I had to try like 10 times before it would finally boot. I don't know that it is related to this, though.


Set up the Syslog Server and upload the file after the next crash. See here for details:

 

        https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/?tab=comments#comment-781601

 

 

If that does not catch anything, run Memtest (an Unraid boot menu option) for 24 hours.

 

Next, try running it in the 'Safe' mode (another boot option).

 

If that doesn't show anything, it is probably time for the hardware swap type of diagnosis. 
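Once a syslog file has been saved, the interesting part is usually the last few lines written before each reboot. Here is a minimal Python sketch for pulling those out. It assumes the saved log includes the kernel's "Linux version" banner at the start of every boot; if your file does not, change the marker to any line that reliably appears at startup. The function name and the default of 20 context lines are just illustrative choices.

```python
import sys
from pathlib import Path

# ASSUMPTION: the kernel boot banner ("Linux version ...") is present in the log.
BOOT_MARKER = "Linux version"


def lines_before_each_boot(lines, context=20, marker=BOOT_MARKER):
    """For every boot marker that is not the first line, return the
    `context` lines that were logged immediately before it."""
    tails = []
    for i, line in enumerate(lines):
        if marker in line and i > 0:
            tails.append(lines[max(0, i - context):i])
    return tails


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python last_lines.py /path/to/saved/syslog
    log_lines = Path(sys.argv[1]).read_text(errors="replace").splitlines()
    for n, tail in enumerate(lines_before_each_boot(log_lines), 1):
        print(f"--- last {len(tail)} lines before boot #{n} ---")
        print("\n".join(tail))
```

If the lines before a reboot show nothing unusual and simply stop, that points more toward a hardware or power problem than a software one.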

