alexrok Posted February 28, 2021 (edited)

Hi All,

I am experiencing an intermittent issue where my Unraid server randomly shuts down. Usually it shuts down right after the array is started and mounted, though once it ran for a few hours before shutting down. The system was stable previously, and I don't remember stability issues starting directly after any recent change. Rather, I recall issues appearing at random, which I originally attributed to family turning the system off to cut the power bill.

My rig is totally custom, with the specs below:

MB: ASRock B450M Pro4 Version - s/n: M80-CA011103432
BIOS: American Megatrends Inc. Version P3.50. Dated: 07/18/2019 (last version good for 2000 series)
CPU: AMD Ryzen 7 2700 Eight-Core @ 3200 MHz
GPU: GeForce GT710 2GB
RAM: 16GB Crucial DDR4
PCIe SATA Controller: ASM1062 Serial ATA Controller
Drives: 3x WD 10TB Shucked Drives, 1x HGST Deskstar NAS 4TB, 2x 240GB Kingston SSD
PSU: Cooler Master MWE 650W 80+ Gold

I have tried totally reinstalling the BIOS at the latest version compatible with my CPU, as well as turning off C-states and setting idle current control to typical in the BIOS. I've also tried multiple known-good DDR4 RAM kits, never at the same time. Everything else is left at default.

When my server comes up it automatically mounts the disks, and after mounting it nearly always hard crashes and shuts down instantly. I am able to boot into the dashboard, and it runs with no issue when the array is not started. I can also get it into maintenance mode with no problem. But if the array starts, it goes right down. I have disabled Docker as well as VMs in an effort to rule these out, and booting in safe mode yields the same results. The last thing I see before it dies is attached as an image.

I'm at my wits' end. I really don't think it is a hardware issue, but rather a software issue that I can't pinpoint.
What really bothers me is that I really wasn't doing much before this started happening. Normal use with no config changes.

Edited February 28, 2021 by alexrok
JorgeB Posted March 1, 2021

If the server shuts down it's likely a hardware problem, like an overheating CPU or a PSU problem.
PeteAron Posted March 1, 2021

Your power supply may be failing? If you have a backup power system, maybe you can get a current/wattage reading while booting. I don't know how much that CPU uses, but a 650 W PS should be good for the number of drives you have, at 7200 rpm, though maybe with not too much room to spare during bootup.
alexrok (Author) Posted March 1, 2021

11 hours ago, JorgeB said:
"If the server shuts down it's likely a hardware problem, like an overheating CPU or a PSU problem."

CPU temps are fine; even under stress tests like Prime95 they did not exceed 70°C by much.
alexrok (Author) Posted March 1, 2021

3 minutes ago, kimifelipe said:
"Your power supply may be failing? If you have a backup power system, maybe you can get a current/wattage reading while booting. I don't know how much that CPU uses, but a 650 W PS should be good for the number of drives you have, at 7200 rpm, though maybe with not too much room to spare during bootup."

I'd be really surprised if it's the power supply. It's a 650W 80+ Gold unit and was bought new for the server. The CPU has a 65W TDP, and the entire system usually pulled around 80W in normal use, up to 180W when used intensively. It boots up fine as well, and I can get into the online interface; if I don't run a parity check it runs fine.

I have one other PSU in my desktop system, but I do not have an appropriate extra SATA connector to power all the drives (it is an SF600 from Corsair, with different pinouts, and it is used in an ITX system, so the extra SATA power cables are long gone). I might investigate buying a PSU from Best Buy or some place with a forgiving return policy, but again, I feel like it is a long shot, as I ran a stress test with no real problem. During the parity check I monitored the system, and according to my watt meter it does not pull more than 120W from the wall, which to me isn't exactly high.

I really will be surprised if this is hardware, though you never know, I suppose.
PeteAron Posted March 1, 2021

Sure, I am just guessing, but it sounds like your problem occurs at high power draw. Your 120W reading is not your max power value. My array uses ~170W with all drives up, for example during a parity check. But power draw during bootup is higher than 200W; it's a very short burst and never the same value. You also have a video card, which I don't have. What is its power draw? I'd add all this up and make sure 650W is enough for you. There are lots of resources on the net to help calculate PS needs. It's also important to have a single 12V rail in your PS for your drives. I don't know the specs of your PS, so I would check that too.

While this sounds like a power issue to me, I obviously can't be sure. It's a possibility I suggest you track down, among any others suggested. Doubting it's the problem is different from knowing that it can't be the problem, if you know what I mean.
alexrok (Author) Posted March 1, 2021

3 hours ago, kimifelipe said:
"Sure, I am just guessing, but it sounds like your problem occurs at high power draw. Your 120W reading is not your max power value. My array uses ~170W with all drives up, for example during a parity check. But power draw during bootup is higher than 200W; it's a very short burst and never the same value. You also have a video card, which I don't have. What is its power draw? I'd add all this up and make sure 650W is enough for you. There are lots of resources on the net to help calculate PS needs. It's also important to have a single 12V rail in your PS for your drives. I don't know the specs of your PS, so I would check that too. While this sounds like a power issue to me, I obviously can't be sure. It's a possibility I suggest you track down, among any others suggested. Doubting it's the problem is different from knowing that it can't be the problem, if you know what I mean."

I have ordered a new PSU just for the sake of not having to wonder. I'm hoping that is the cause, as it would be a lot easier to chase than other issues. I did confirm there is a single 12V rail, and I did take the time to do all the calculations. Load wattage would be 225W on this machine, with an expected 10.5A on the 3.3V rail, 10.4A on the 5V rail and 15.6A on the 12V rail. This is well within the spec of the unit and consistent with my measurements, as I wouldn't expect to see load wattage when booting. Maybe a spike, but my measurements are pretty close to the theoretical max, so I feel good about them. I might guess that the differences in our highest observed power draw are related to the fact that I have 1/3 the hard drives that you do, along with a lot of different components.
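For anyone who wants to sanity-check rail figures like the ones in this post, the arithmetic is just P = V x I per rail, summed. Below is a minimal Python sketch that uses only the currents and the 650W rating quoted in the thread; the function and constant names are made up for illustration. One thing it surfaces: the three quoted currents multiply out to roughly 274W rather than the 225W total, so the calculator's total and its per-rail currents may not describe the same operating point (that part is a guess). Either figure is well under the unit's rated output.

```python
# Rail currents quoted in the post above (calculator estimates, not measurements).
RAIL_LOAD_AMPS = {3.3: 10.5, 5.0: 10.4, 12.0: 15.6}

# Rated output of the Cooler Master MWE 650W listed in the original specs.
PSU_RATED_WATTS = 650


def rail_watts(load_amps):
    """Return per-rail power in watts (P = V * I), keyed by rail voltage."""
    return {volts: volts * amps for volts, amps in load_amps.items()}


def total_watts(load_amps):
    """Return the sum of the per-rail powers in watts."""
    return sum(rail_watts(load_amps).values())


if __name__ == "__main__":
    for volts, watts in rail_watts(RAIL_LOAD_AMPS).items():
        print(f"{volts:>4} V rail: {watts:6.1f} W")
    total = total_watts(RAIL_LOAD_AMPS)
    share = total / PSU_RATED_WATTS
    print(f"sum of rails: {total:.1f} W ({share:.0%} of rated output)")
```

This only checks steady-state totals. It says nothing about very short spikes at drive spin-up, or about a unit that can no longer deliver its rated output, which is what swapping in a new PSU would test.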