unraid crashing right after restart...unclear as to why


Bit of a puzzler, this. My system has been plagued by random crashes and various problems since I built it 4 years ago. From my understanding and previous posts, it's due either to my lack of proper setup or simply to running on Ryzen. That aside, the system had been up for about a month with no issues until last week, when it crashed. Unfortunately I'm working out of town right now and can't access the hardware; all I can do is have a buddy hit the restart button and send a picture of the console display on the attached monitor when he gets around to it. Since that one crash, it usually crashes right after rebooting, once Docker has started a few containers, though sometimes it stays up for a good hour. Each crash seems to have a different cause, but I've seen this one 3 times now (see attached picture). Can anyone help me with this, or at least let me know if this is a hardware failure?

IMG_20220316_214743_01_20220316_214829.jpg

Alright, I'll give it a try this weekend when I get back home. I've already done everything I could from the Ryzen FAQ about setting it up, and it didn't help except to make the crashes a bit less frequent. Updating to 9.10.0rc1 didn't fix it either, though I might also try rc3 to get a newer kernel.

OK, so I got around to trying a memtest. For some reason it won't let me boot in legacy mode: my motherboard just says the media is unbootable, but it boots just fine in UEFI mode. Did I forget to do something? I did notice, however, that when I upgraded my RAM (from 32GB to 64GB) I forgot to check whether the board set the speed correctly, and it was running 3200MHz RAM at 2400MHz. As far as I'm aware that shouldn't be a problem, since it wasn't overclocked and I'd just be losing a bit of performance, but please correct me if I'm wrong or if that could have been the problem.


For anyone who is still interested in this, I figured out the problem. It turns out my power supply was either failing, or I was drawing too little power from it for it to do its job efficiently and supply stable power. I was using a Corsair sfx-750 on a system that draws a max of about 250W. I can't determine the exact usage, because my only number comes from my UPS, which has my Unraid server, another system running pfSense, security cameras and 3 routers on it. In Unraid the UPS reports an average usage of around 255W, and the highest I saw was about 300W. So I installed a smaller sfx-400 (again Corsair, as it was the only one I could get; it is also 80 Plus Platinum, while all the other SFX power supplies I could find were only Bronze or Silver). The system has now been up for 3 weeks, only turning off when I manually reboot or shut it down to do other things, and I haven't had any problems since swapping the power supply.

 

If anyone cares to know the science behind what I found out, or if it could be useful to someone, here it is. Power supplies have efficiency curves as well as stability curves. A power supply is *most* efficient and stable at about 40-55% load. Above that you only lose a few percent of efficiency (my particular one is 94% efficient at 45% load, its peak, and only drops to 92% at 100% load). But efficiency isn't what I'm talking about here; it's the stability curve.

Voltage regulators are bits of silicon that take an unstable voltage and, wait for it, regulate it to a constant voltage that is generally very stable with little or no fluctuation. They do this by effectively working like a light switch, turning the power on and off very fast. They are very good at it when they are within their specified stability range, meaning there must be at least X amps being drawn for them to make stable, clean volts. If the draw gets too high, the regulator overheats and may fail, or it causes power "flickering": the rate at which it switches on/off slows down to account for the current draw, so you end up with a burst of high current and voltage, then it turns off, and this repeats until the load goes down or the regulator burns up.

However, if there isn't *enough* load on a regulator, it can also create voltage fluctuations. These are far less harmful, as we're talking as low as 0.1-0.2V or even less, depending on the rail voltage. But to a CPU or memory module, a 0.2V swing matters: where the voltage should be, say, 3.3V, one moment it could get 3.5V and the next 3.1V. Memory modules do not like that much variation; they can accept a difference one decimal place smaller (3.28V-3.32V, for example).

My setup was a 750W power supply serving a 250W peak load, which is about a 33% load. In theory that should be OK, and if the server ran at peak all the time it probably would be, but I suspect it often dipped down to around 200W or lower, which would be in the 20s. I believe most power supplies' stability curves start being stable enough for computers to be, well, stable at around 30-35%. Now that I've lowered my power supply rating by going to the 400W, I've increased my max load to just over 60% of the power supply instead of about 33%. That should mean that even at the lowest my system runs, I should still be at about the 40% load range. That's enough of that lecture on power supplies. If anyone wants more on this, LTT did a great video on it a while ago, link here
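For anyone who wants to check the load numbers above against their own setup, here is a rough sketch of the arithmetic in Python. The function name `load_percent` is made up for illustration, and the wattages are the approximate figures from this thread (the UPS reading includes other devices, so they are estimates, not measurements of the server alone).

```python
def load_percent(draw_watts: float, rated_watts: float) -> float:
    """Return the power draw as a percentage of the PSU's rated output."""
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    return 100.0 * draw_watts / rated_watts


# Approximate figures from the thread: ~200 W typical, ~250 W peak,
# compared on the old 750 W unit and the new 400 W unit.
for rated in (750, 400):
    for draw in (200, 250):
        pct = load_percent(draw, rated)
        print(f"{draw} W on a {rated} W PSU: {pct:.1f}% load")
```

Plug in your own draw and PSU rating to see roughly where you sit. This only gives the load percentage; it says nothing on its own about where a particular power supply becomes unstable.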

 

tl;dr: power supplies cause problems too; don't just throw the biggest power supply available into your system.

 

If any info above is incorrect, or if anyone has more data than I shared, please correct me; I like learning new things. (Please don't tell me I'm totally wrong, though. I'm not; I do have a background in building electronics. Not sure why I didn't think about my power supply until now.)

 
