Unraid Crashing

Hey guys, so usually my server is pretty good when it comes to up-time. Last week it had some issues with randomly crashing, though. I was losing everything: no webGUI, no SSH, no keyboard input, nothing. I had to hard reset the power once. Everything came back fine and worked after that. Since last week, I have replaced my motherboard and CPU, and I am still having these issues. I had them on 6.6.6, then upgraded to the newest RC, and I still have them. Trying to figure out what's going on.

 

My up-time is a max of 6 hours at the moment. The server will just flat out crash randomly, or when I am starting up VMs or Dockers. I have attached the logs I was able to grab after the last crash. I have also run the "Fix Common Problems" plugin and it finds nothing. I have also updated my BIOS to the latest version and turned off C-states on my motherboard.

 

Here are my specs.

Ryzen 7 1700 (overclocked to 3.5 GHz)

ASUS PRIME B350-PLUS

 

My first thought was that it might be the overclock on my CPU. However, this happened to me prior to the new CPU and motherboard, on a stock AMD chip. So as of right now, I am going to turn the overclock off after I reboot my server just to make sure this isn't it.

 

Thanks!

 

tower-diagnostics-20190310-2224.zip

  • Community Expert

Be sure to run Memtest (a boot menu option, selectable when Unraid is booting) for 24 hours. NOTE: Memtest will only test non-ECC memory.

 

Power supplies have been the cause of this type of problem in the past. Swapping in a known-good part is the only real way to test a PSU.

  • Author

I do have a PSU tester, won't that be able to tell me if my PSU is bad?

I can do the memory test as well, brand new memory though. Would really suck if I got the bad batch.

 

EDIT:

By the way, the server literally just crashed about 5 minutes ago as I was provisioning a Windows Server VM (2 cores, 2 GB RAM).

Just wanted to update the post. It crashed right when I was hitting the field to search for my install media. It started to search the ISOs folder and that's when I lost connectivity. I almost feel like this is a bad SATA card or something to do with a drive, since it seems that whenever the data is accessed it throws Unraid for a loop.

Edited by Micaiah12
Added an update

  • Community Expert

The problem with PSU testers is that most of them only find the supplies that are completely bad, not the flaky ones. And those flaky ones are the troublesome problem givers. By the way, have you consulted the Ryzen thread to see if your particular CPU needs any special BIOS settings?

 

Bad RAM is not common, but it is one of the cheapest possible causes to test for. It only costs time...

11 hours ago, Micaiah12 said:

Ryzen 7 1700 (overclocked to 3.5 GHz)

Don't overclock it. Overclocking is for gaming rigs, not servers where, above anything, you want stability. It will precision boost all cores to 3.2 GHz for most of the time under load - the Wraith Spire is good enough for that. Don't overclock the memory controller either. So 2666 MT/s or lower, depending on your RAM configuration. Install the latest BIOS and look for the Power Supply Idle control. The default is low current - set it to normal current.

 

20 minutes ago, Micaiah12 said:

brand new memory though. Would really suck if I got the bad batch.

Brand new is a good time to test it. Bad DIMMs happen but being new it will be easy to replace. In any case, reputable brands have lifetime warranties.

  • Community Expert

Another RAM question: how many sticks have you installed? If more than two, double check whether the motherboard manufacturer has a list of recommended RAM part numbers for that board. A few motherboards are very picky about RAM... (You can google this issue to see if your board is one of them.)

1 hour ago, Frank1940 said:

A few motherboards are very picky about RAM... 

AM4 motherboards are especially picky but getting better with each BIOS release.

  • Author

Two RAM modules. I only bought RAM that was compatible with my motherboard. I have the memory test going now and I will let it run overnight.

 

As for the overclocking, I understand that it can make things unstable, but my last CPU/MOBO combo was overclocked and I wasn't having any issues really until the last crash before I replaced the motherboard and CPU. If a stable clock can be obtained, is there no real reason not to overclock it? Just wondering is all.

 

I will consult the Ryzen thread again, that is where I got the idea to turn off C states. I will continue to browse through it as well as do the memory test.

 

 

1 hour ago, Micaiah12 said:

is there no real reason not to overclock it?

Ryzens don't overclock by much. The 1000-series is already close to the limits of the silicon so there's not a lot to be gained. The 2000-series in particular, using Precision Boost and XFR2, do pretty well by themselves if you keep them cool enough. I really wouldn't do it.

 

Don't turn off C-states if you don't need to. Try the Power Supply Idle control in the BIOS first.

  • Author

Yes, I saw the power supply idle control down at the bottom of one thread. I will do that when I get home.

I've disabled the overclock. I suppose I can get into the overclocking project afterwards if I decide to. It just seems that when I was overclocking this CPU and my previous one, I was getting much better performance in the Windows VMs than at stock clocks.

Given the title of this thread, you do indeed need to concentrate on stability in the first instance. Once you've achieved that then do whatever you want - at your own risk, of course.

  • Author

Correct, will update tomorrow morning with memory test update.

  • Author

Hey guys, just remembered: I grabbed this screenshot earlier today before the last crash.

55BEE3B1-5170-4EDA-99B9-ED69E491051D.jpeg

AMD-Vi and IOMMU are to do with hardware passthrough to a VM. The ATA error messages are about a SATA link failing and the disk dropping offline. The controller repeatedly tries to re-establish the link but doesn't get a response. Grab diagnostics before you shut down, then check both the power and data cables.
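
If it helps to see which SATA port is dropping, here is a rough Python sketch that counts ATA link problem lines per port in a saved syslog. The message fragments in the pattern are assumptions based on typical kernel wording, and the sample lines are made up for illustration (they are not from the attached diagnostics), so adjust both to what your own log actually shows.

```python
import re
from collections import Counter

# Assumed message fragments (typical kernel ATA wording); adjust to match your own syslog.
ATA_PROBLEM = re.compile(
    r"\b(ata\d+)(?:\.\d+)?: .*"
    r"(link down|hard resetting link|COMRESET failed|exception Emask|failed command)",
    re.IGNORECASE,
)


def count_ata_problems(lines):
    """Return a Counter mapping each ATA port (e.g. 'ata5') to its number of problem lines."""
    counts = Counter()
    for line in lines:
        match = ATA_PROBLEM.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample only, not taken from the poster's log.
    sample = [
        "Mar 12 21:40:01 Tower kernel: ata5: SATA link down (SStatus 0 SControl 300)",
        "Mar 12 21:40:06 Tower kernel: ata5: hard resetting link",
        "Mar 12 21:40:11 Tower kernel: ata5: COMRESET failed (errno=-16)",
        "Mar 12 21:40:12 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    ]
    for port, hits in count_ata_problems(sample).most_common():
        print(port, hits)
```

A port that keeps showing up at the top of the list is the one whose cable, drive, or controller connection is worth checking first.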

  • Author

Memory test finished. Looks like no errors on the RAM side.

 

I will try to reproduce the crash. However, when it crashes I can't do anything to access the server, so how am I going to grab the logs? I remember that there was a way to continuously write the logs to the USB drive. How do I do that again?

  • Community Expert

Install the Fix Common Problems plugin and turn on its troubleshooting feature. That will write the files to the logs folder on the flash drive.

  • Author

Ok I will do that when I get home and then I should be able to pull the logs.

Thanks for all the help guys!

  • Author

Hey guys. So apparently, troubleshoot mode doesn't exist as I am on 6.7.0.

I have attached an image of my monitor when the server crashed as well as the system log from logging.html.

It seems that when I start the array and the rebuild takes place, there is a huge number of errors on disk1 and then the server crashes. This could be in line with what happens whenever I am trying to build out a VM: when I browse for the ISO file, the Unraid server crashes, since it starts to read from disk1. I don't know exactly which disk is disk1, but I am going to reboot and see if I can find that out. I really am starting to think it's a bad SATA controller. Let me know what you think.

 

EDIT: found that disk1 is the HGST. It's one that is connected to the SATA controller.

 

IMG_1308.HEIC

System Log.htm

Edited by Micaiah12

  • Community Expert

If you are running the latest RC version of 6.7.0 (not sure when in the RC series this feature was added) and you go to Settings >>> Network Services >>> Syslog Server, you will find an improved version of the Troubleshooting tool that used to be in Fix Common Problems. Use the Help feature for instructions. (If I remember correctly from when I played around with it, I set the Remote syslog server to the same server being monitored and turned on Mirror syslog to flash.)
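
Once a mirrored syslog exists, the interesting part is usually the last few lines written before each restart. A minimal Python sketch for pulling those out follows. It assumes the kernel logs a line containing "Linux version" at every boot, and the file path is hypothetical, so check where your server actually writes the mirror. It also cannot tell a crash from a clean reboot.

```python
from pathlib import Path

# Assumption: the kernel logs a line containing "Linux version" once at every boot.
BOOT_MARKER = "Linux version"


def last_lines_before_each_reboot(lines, keep=20):
    """Split a syslog into boot sessions and return the final `keep` lines of every
    session that was followed by another boot (the lines written just before a restart)."""
    sessions = [[]]
    for line in lines:
        # A boot marker starts a new session unless the current one is still empty.
        if BOOT_MARKER in line and sessions[-1]:
            sessions.append([])
        sessions[-1].append(line)
    # The last session is the current boot, so it has not ended in a restart yet.
    return [session[-keep:] for session in sessions[:-1]]


if __name__ == "__main__":
    # Hypothetical location of the mirrored log; adjust to your setup.
    log_path = Path("/boot/logs/syslog")
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        tails = last_lines_before_each_reboot(text.splitlines())
        for number, tail in enumerate(tails, start=1):
            print(f"--- end of boot session {number} ---")
            print("\n".join(tail))
```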

  • Author
2 hours ago, Frank1940 said:

If you are running the latest RC version of 6.7.0 (not sure when in the RC series this feature was added) and you go to Settings >>> Network Services >>> Syslog Server, you will find an improved version of the Troubleshooting tool that used to be in Fix Common Problems. Use the Help feature for instructions. (If I remember correctly from when I played around with it, I set the Remote syslog server to the same server being monitored and turned on Mirror syslog to flash.)

I went to do that. Powered Unraid on, but now it looks like I lost another drive; this time it's disabled entirely. I'm afraid that if I start the array I am going to have two emulated disks and I won't have my data in parity. Wondering now what the best option is. I don't know if the constant resets caused this or if the SATA controller really is messed up and has been causing issues all along. Sorry guys, really trying to get you the logs. How should I proceed?

Screenshot from 2019-03-12 21-56-37.png

Edited by Micaiah12

  • Community Expert
3 hours ago, Micaiah12 said:

I'm afraid that if I start the array I am going to have two emulated disks and I won't have my data in parity.

Unraid can't emulate two disks with single parity. A new config should recover disk1, but disk4's data might be damaged, since it looks like it was in the middle of a rebuild.
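
To illustrate why single parity only covers one missing disk, here is a toy Python sketch. It assumes single parity behaves like a plain byte-wise XOR across the data disks. The function names and the tiny "disks" are invented for the example and have nothing to do with Unraid's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_data, parity):
    """Recover the single missing data block from the surviving blocks plus parity."""
    return xor_blocks(surviving_data + [parity])


if __name__ == "__main__":
    # Toy data, not real disk contents.
    disk1, disk2, disk3 = b"\x0f\xa0", b"\x33\x11", b"\xc4\x5e"
    parity = xor_blocks([disk1, disk2, disk3])

    # One disk missing: XOR of everything that is left reproduces it.
    assert rebuild_missing([disk1, disk3], parity) == disk2

    # Two disks missing: all that can be computed is disk1 XOR disk2,
    # and many different pairs of contents give that same value.
    combined = xor_blocks([disk3, parity])
    assert combined == xor_blocks([disk1, disk2])
    print("one missing disk rebuilt; with two missing only their XOR is known:", combined.hex())
```

With two disks gone, the parity equation has two unknowns, which is why a second failure cannot be emulated.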

  • Author

So if I use the new configuration option I can probably get the array back online but disk 4 will need to be rebuilt from the parity?

  • Community Expert

You can force a rebuild, but success depends mostly on whether there were any array data changes since the disk got disabled. If you want to do it, I highly recommend using a new disk so you can keep the old one intact.

  • Author

I've attached the syslog that I mirrored to the USB drive. 

syslog

  • Community Expert

Please post the diagnostics instead: Tools -> Diagnostics. Also, what do you want to do, rebuild disk4?

Archived

This topic is now archived and is closed to further replies.
