Power Supply failed, and now getting random unclean shutdowns following install of replacement power supply

GalaxyBird · December 1, 2022

Hello!

My Unraid setup shut down sometime last spring and wouldn't boot up again. I assumed (and hoped) it was an issue with the power supply. This past weekend, I finally had time to swap it out (550 W up to 750 W). One small heart attack with the root password later, I was back up and running.

I started up the array and initiated a parity check, which passed without any issues. My Fix Common Problems plugin was also out of date, so I updated it. I still have a number of items to work through there, but I've had three instances of the system shutting down for no reason that I can tell.

I'm in a pretty cool basement, and the disk temperatures don't seem to be an issue at all.

I took a look at this post (the issue has happened for me when I don't have a terminal session open) and implemented the recommendations around shutdown timing there, but the issue is still occurring. (I also don't have VMs running, so I assume that timing change doesn't apply to my situation, but I did it anyway.)

I set up log mirroring and have attached a log file that covers one of the shutdowns. I tried to look through the logs myself, but I'm not yet familiar enough with the output to draw conclusions. Something that stood out is that this item occurs 18,195 times

```
Nov 30 10:20:17 Tower emhttpd: error: mdcmd, 2723: Input/output error (5): write
Nov 30 10:20:17 Tower kernel: mdcmd (18193): spindown 4
Nov 30 10:20:17 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5
Nov 30 10:20:18 Tower emhttpd: error: mdcmd, 2723: Input/output error (5): write
Nov 30 10:20:18 Tower kernel: mdcmd (18194): spindown 4
Nov 30 10:20:18 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5
Nov 30 10:20:19 Tower emhttpd: error: mdcmd, 2723: Input/output error (5): write
Nov 30 10:20:19 Tower kernel: mdcmd (18195): spindown 4
Nov 30 10:20:19 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5
```

before the logs went quiet for ~24 hours, so I think this led up to one of the restarts (I'm not certain about that, though). If I'm reading this correctly, it might be saying there is an issue with disk 4 of the array? That disk reports that it passed the SMART health check, but maybe it's still unhealthy.
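For anyone who wants to check how many of these entries a mirrored syslog contains, and whether they all point at the same disk, here is a minimal Python sketch. It only counts lines shaped like the `do_drive_cmd ... ioctl error: -5` entries in the excerpt above. The function name `count_spindown_errors` and the regular expression are illustrative assumptions, not part of Unraid.

```python
import re
from collections import Counter

# Matches kernel lines shaped like the excerpt in this thread, e.g.
#   Nov 30 10:20:17 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5
# The pattern is an assumption based only on that excerpt.
SPINDOWN_ERROR = re.compile(r"do_drive_cmd: (disk\d+): ATA_OP e0 ioctl error: -5")


def count_spindown_errors(lines):
    """Return a Counter mapping disk name -> number of matching error lines."""
    counts = Counter()
    for line in lines:
        match = SPINDOWN_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input copied from the log excerpt posted above.
sample = [
    "Nov 30 10:20:17 Tower emhttpd: error: mdcmd, 2723: Input/output error (5): write",
    "Nov 30 10:20:17 Tower kernel: mdcmd (18193): spindown 4",
    "Nov 30 10:20:17 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5",
    "Nov 30 10:20:18 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5",
    "Nov 30 10:20:19 Tower kernel: md: do_drive_cmd: disk4: ATA_OP e0 ioctl error: -5",
]

for disk, total in count_spindown_errors(sample).most_common():
    print(f"{disk}: {total}")
```

To run it against a real file, pass an open file handle instead of `sample`, for example `count_spindown_errors(open("syslog.log", errors="replace"))`, where `syslog.log` is the attachment name used in this thread.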

I've got logging going, so I'm going to try to keep a closer watch on when the next shutdown occurs and upload a log file that definitely leads up to the restart.
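One way to narrow down when a shutdown happened is to look for long silent stretches in the mirrored syslog, like the ~24 hour gap mentioned above. The sketch below is a rough Python helper under a few assumptions: every relevant line starts with a classic `Mon DD HH:MM:SS` timestamp, the year has to be supplied because syslog lines omit it, and the name `find_log_gaps` and the one-hour threshold are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def find_log_gaps(lines, year=2022, threshold=timedelta(hours=1)):
    """Yield (previous_time, next_time) pairs where the log was silent
    for longer than `threshold`. Lines without a leading timestamp are skipped."""
    previous = None
    for line in lines:
        try:
            # The first 15 characters hold the timestamp, e.g. "Nov 30 10:20:19".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        if previous is not None and stamp - previous > threshold:
            yield previous, stamp
        previous = stamp


sample = [
    # First two lines are taken from the excerpt in this thread.
    "Nov 30 10:20:18 Tower kernel: mdcmd (18194): spindown 4",
    "Nov 30 10:20:19 Tower kernel: mdcmd (18195): spindown 4",
    # Hypothetical line, invented only to illustrate a gap of about a day.
    "Dec  1 10:25:00 Tower kernel: example line after a restart",
]

for before, after in find_log_gaps(sample):
    print(f"quiet from {before} to {after} ({after - before})")
```

A gap by itself does not prove a crash (an idle server can also log very little), but the last lines before a gap are a reasonable place to start reading. The sketch does not handle logs that span a New Year boundary.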

Thanks for any input and help!

syslog.log

JorgeB · December 2, 2022

That error could be a SAS disk not spinning down. Please post the diagnostics.

GalaxyBird · December 2, 2022

I've attached the diagnostic zip. Thank you!

tower-diagnostics-20221202-0723.zip

JorgeB · December 2, 2022

You have SAS disks; either disable spin down or install the SAS spin down plugin.

As for the unclean shutdowns, it's not clear to me whether you see an unclean shutdown after you shut down or reboot the server yourself, or whether the server is crashing and rebooting on its own.

GalaxyBird · December 2, 2022

Thanks again for the reply!

With respect to your comment on "disable spin down": I interpret this to mean going to http://tower/Settings/DiskSettings and setting "Default spin down delay" to "Never". Is that what you mean?

I'm reading up on the plugin here, and that seems like a better option.
