Crashing unexpectedly


Psyonus


I am hoping someone out there might have a theory for this:
I have been running my Unraid server since 2016, and in that time I never had a single crash (non-responsive or locked up). It has been a rock as far as reliability goes.

I use it mainly for storage, with a Plex docker hosting my content. I have a couple of game server dockers and a Windows VM (no GPU passthrough) that I remotely connect to from time to time.
I had been using a router supplied by my ISP, but 2 weeks ago I installed pfSense on its own hardware, separate from the Unraid server.
Initially I set it all up and gave my server a static IP, as I had done with the previous router (they use different IP ranges). I tested Plex using 4G on my phone and bam, 1st crash... I restarted my Unraid server and tested again, and again it crashed in exactly the same way. To be clear, it crashed when I tried to access Plex remotely but not when I accessed it locally.
I decided to have my server accept a DHCP address and it then worked; I was able to connect to Plex without issues.
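
Since the two routers use different IP ranges, one thing worth ruling out is a static address carried over from the old range. The following is only a minimal sketch of that check using Python's standard `ipaddress` module; the function name `static_ip_problems` and every address shown are hypothetical examples, not values taken from this server or router.

```python
import ipaddress


def static_ip_problems(address, lan_cidr, gateway, dhcp_start=None, dhcp_end=None):
    """Return a list of problems with a proposed static address (empty if none)."""
    network = ipaddress.ip_network(lan_cidr, strict=False)
    host = ipaddress.ip_address(address)
    problems = []

    # The server and its default gateway must both sit inside the router's LAN subnet.
    if host not in network:
        problems.append("address is outside the LAN subnet")
    if ipaddress.ip_address(gateway) not in network:
        problems.append("gateway is outside the LAN subnet")

    # The network and broadcast addresses cannot be assigned to a host.
    if host in (network.network_address, network.broadcast_address):
        problems.append("address is the network or broadcast address")

    # A static address inside the DHCP pool can be handed out to another device.
    if dhcp_start and dhcp_end:
        low = ipaddress.ip_address(dhcp_start)
        high = ipaddress.ip_address(dhcp_end)
        if low <= host <= high:
            problems.append("address collides with the DHCP pool")

    return problems


# Hypothetical example: an address from an old 192.168.0.x range on a 192.168.1.0/24 LAN.
print(static_ip_problems("192.168.0.50", "192.168.1.0/24", "192.168.1.1"))
```

A misconfigured address would normally cause connectivity problems rather than a hard lockup, so this only eliminates one variable; it does not explain the freeze by itself.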

To reiterate my config: I have a Plex docker, 2 game dockers and a Win10 VM.
The other day I was playing a game of Space Engineers hosted on my Win10 VM, the same as I had done for years. I had no new containers and no new VM, just the same setup, and the server appeared to crash out of nowhere.
I was a bit frustrated at this point but persisted. Then this morning, in preparation for moving things out of the closet to install a rack and run some cables, I shut down my dockers and VM and it suddenly crashed again. I am going to revert to my old router and see if the crashes continue.
I run my server off a UPS (a decent one), so power was never an issue.

I did have the server booted in GUI mode the last time it crashed, and it was completely unresponsive and frozen on the dashboard.
If any ideas come to you, I would love to hear them. I really want to use pfSense, and it was nice not having my router open for my ISP to just look in whenever they want.

More concerning is the sudden instability. Does anyone know of a conflict between pfSense and Unraid?

server-a-tron-diagnostics-20200118-0246.zip


Hi testdasi,

 

OK I will detail as best I can.

 

Crash 1: I had newly installed my pfSense router and was accessing Unraid through the webGUI on its local IP. I noticed the GUI could no longer reach Unraid. I used PuTTY to SSH in and it could not establish a connection. I connected a monitor, keyboard and mouse to the server and could get no input. I had to manually power off.

 

Crash 2: Exactly the same as crash 1.

 

Crash 3: I was playing a game hosted on my Win10 VM. My game suddenly disconnected and I could not access Unraid. When I went to investigate, the server was powered off.

 

Crash 4: I was closing my VM and dockers to shut down my server, and while closing them it froze. It was booted with the GUI option and had a keyboard, mouse and monitor attached, but it was unresponsive and showed a static image of the dashboard.

 

I hope this is what you need, thanks for your help!


1st thing to do: turn on the syslog server and mirror the syslog to your flash drive. Then try to reproduce the crash and, once rebooted, attach the syslog. It will hopefully have some clues as to what happened during the crash.

(The diagnostics you attached were taken after a reboot, so they don't include the pre-crash syslog.)
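
Once a pre-crash syslog has been captured, it can help to pull out the lines most likely to matter before reading the whole file. Below is a minimal sketch, assuming the mirrored log is a plain-text file; the keyword list and the names `SUSPECT`, `find_suspect_lines` and `scan_file` are hypothetical choices for illustration, not part of Unraid.

```python
import re
from typing import Iterable, List, Tuple

# Assumed keyword list: strings that often show up in Linux kernel logs around
# a fault. It is a starting point for a human reader, not an exhaustive filter.
SUSPECT = re.compile(
    r"kernel panic|call trace|machine check|hardware error|"
    r"out of memory|oom-killer|bug:|segfault",
    re.IGNORECASE,
)


def find_suspect_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line that matches a suspect keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SUSPECT.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan a saved syslog file; undecodable bytes are replaced rather than raising."""
    with open(path, "r", errors="replace") as handle:
        return find_suspect_lines(handle)
```

The last few lines written before the freeze are usually worth reading in full as well, since a hard lockup may leave no matching keyword at all.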

 

Then run memtest for 24 hours.

 

Finally, see if you can swap the PSU. Crash 3 suggests a possible problem with the PSU, since a server doesn't usually power itself off on its own.


I moved the server to a new chassis approximately 1 month ago and fully cleaned it out.

I bought a 1200 W FSP PSU, but when I powered the server on it would not detect all the drives.

I reinstalled my 1000 W FSP PSU and all is working again.

The reason I bought the 1200 W PSU was that the 1000 W one is over 3 years old and I wanted to replace it before there was an issue.

Do you have any recommendations for a specific brand/model that has a reliable track record?

 

On the off chance it helps, I will explain what I experienced with the 1200 W PSU, as I am at a loss as to why this is happening.

 

The 1200 W PSU I installed is an FSP AURUM PT SERIES, model number PT-1200FM. The shop I bought it from gave me a replacement unit (slightly newer than the one I had) and I got exactly the same result.

I am happy to buy a new PSU to try, but dropping hundreds of dollars on another PSU only to have it do the same thing is not ideal :)

Could it be that my 1000 W PSU, being older, is struggling now where it was not before?

 

My server

 

Motherboard: X8DTL-3F

CPU: 2x Xeon X5670 

Memory: 48GB DDR3

14x Mechanical drives (1 hot spare)

2 SSD Cache

GTX 1660 Super (transcoding for plex only)

USB 3 pci-e card

SATA Host Controller: 88SE9235 (https://www.marvell.com/storage/system-solutions/assets/Marvell-88SE92xx-002-product-brief.pdf)

 

From what I can find, the power needs are well within the capacity of the 1000 W PSU.
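
As a rough sanity check of that, here is a back-of-the-envelope sketch of the load for the parts listed above. The CPU and GPU figures are the vendors' published ratings as far as I know; every other wattage is an assumption made for illustration, and none of it is a measurement of this server. The names `estimate_load_w` and the constants are hypothetical.

```python
# Every wattage here is an assumption for illustration, not a measurement.
CPU_TDP_W = 95          # published TDP for one Xeon X5670
GPU_TDP_W = 125         # published board power for a GTX 1660 Super
HDD_SPINUP_W = 30       # assumed peak per 3.5" drive while spinning up
HDD_ACTIVE_W = 8        # assumed per drive once spinning
SSD_W = 5               # assumed per SATA SSD
BOARD_RAM_MISC_W = 100  # assumed: motherboard, 48GB DDR3, fans, add-in cards
PSU_W = 1000


def estimate_load_w(cpus: int, hdds: int, ssds: int, gpus: int, hdd_w: int) -> int:
    """Sum the assumed per-component draw for the given part counts."""
    return (
        cpus * CPU_TDP_W
        + hdds * hdd_w
        + ssds * SSD_W
        + gpus * GPU_TDP_W
        + BOARD_RAM_MISC_W
    )


# Worst case: all 14 drives spin up at the same moment with CPUs and GPU at full load.
spinup_w = estimate_load_w(cpus=2, hdds=14, ssds=2, gpus=1, hdd_w=HDD_SPINUP_W)
running_w = estimate_load_w(cpus=2, hdds=14, ssds=2, gpus=1, hdd_w=HDD_ACTIVE_W)

print(f"spin-up estimate: {spinup_w} W of {PSU_W} W")
print(f"running estimate: {running_w} W of {PSU_W} W")
```

Under these assumptions the total stays below 1000 W even with every drive spinning up at once. A total like this says nothing about how much current the 12 V rail can deliver during spin-up, or whether an older unit still reaches its rating, so it does not rule the PSU out.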

 

If there is anything else I can add to help diagnose this, I will do my best. I have not had a great deal of experience in this area, so I am a bit stumped.

 

 
