BigHairyWookie Posted March 29, 2020 (edited)

Hi there

I'm only a couple of weeks into using Unraid and I've suddenly started getting loads of instability issues, where the server suddenly freezes. No response from the GUI, can't ping the server, no SSH access, console unresponsive.

At first I thought this was down to a plugin, but I've now removed all plugins and I'm still getting the problem. In fact it's now gone so far that after the latest lockup I can't actually start the array, in normal or safe mode, with or without a GUI.

Can anyone help with some ideas on what to do?

**Edit** I can start the array in maintenance mode if I boot in safe mode, but I don't appear to be able to mount the disks.

Attachment: tower-diagnostics-20200329-1128.zip
JorgeB Posted March 29, 2020

56 minutes ago, BigHairyWookie said:
I can't actually start the array, in normal or safe mode, with or without a gui.

There's likely some filesystem corruption because of all the crashing. Start the array normally, then type "diagnostics" on the console. The diagnostics will be saved to the logs folder on the flash drive; upload them here.
BigHairyWookie Posted March 29, 2020 Author (edited)

4 minutes ago, johnnie.black said:
Likely there's some filesystem corruption because of all the crashing, start the array normally then type "diagnostics" on the console, they will be saved to the flash drive logs folder, upload them here.

Unfortunately, every time I start the array normally the whole system freezes, so I can't get to any diagnostics.

I've just been trying to run an xfs_repair on the array disks. This was fine on disks 2-4, but on disk 1 there was metadata to be replayed, so I ran xfs_repair with -L and the whole system froze again. So I'm starting to lean towards that disk as the issue.

I just tried starting the array without disk 1 and the whole server froze again.
JorgeB Posted March 29, 2020

Just now, BigHairyWookie said:
the whole system freezes. I can't get to any diagnostics.

If even the console freezes, it's probably not just filesystem corruption. It's possibly a more serious issue, but without diags it's difficult to guess.
BigHairyWookie Posted March 29, 2020 Author

3 minutes ago, johnnie.black said:
If even the console freezes it's probably not just fs corruption, possible a more serious issue, but without diags difficult to guess.

OK, I get that. I really don't know where to start now, short of ripping it all apart and starting again.
testdasi Posted March 29, 2020

43 minutes ago, BigHairyWookie said:
Ok, I get that. I really don't know where to start now, short of ripping it all apart and starting again

Start with your PSU and power connectors. When the array starts, the drives draw current, which might cause a voltage drop on your CPU / motherboard / RAM and lead to a freeze. Try disconnecting and reconnecting all the power connectors to the drives.

Also, how old is your PSU? You have a relatively old system, so if your power supply is only just adequate, aging may have dropped its output below what the system requires.

The next suspect is your SATA card, which is PCI. A quick search for your controller, the sil3114, turned up posts from back in 2009 about freezing issues with it. So if you have a PCIe SATA card and/or are able to run a configuration without the PCI card, that may provide some clues.
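As a rough illustration of the power-supply point above, here is a minimal Python sketch of a spin-up power budget. Everything numeric in it is an assumption for illustration only: the 2 A per drive on the 12 V rail during spin-up, the rail capacity, the base load, and the drive count are hypothetical placeholders, not measurements from this system. Check the labels on your own drives and PSU for real figures.

```python
def spinup_draw_watts(drive_count, amps_12v_per_drive=2.0, rail_voltage=12.0):
    """Estimate the peak 12 V draw (in watts) if all drives spin up at once.

    amps_12v_per_drive is an assumed placeholder; real values are printed
    on each drive's label or listed in its datasheet.
    """
    return drive_count * amps_12v_per_drive * rail_voltage


def headroom_watts(rail_capacity_watts, base_load_watts, drive_count, **kwargs):
    """Return the spare 12 V capacity left during spin-up.

    A negative or very small result suggests the PSU may sag when the
    array starts. All inputs are estimates supplied by the caller.
    """
    peak = spinup_draw_watts(drive_count, **kwargs)
    return rail_capacity_watts - base_load_watts - peak


if __name__ == "__main__":
    # Hypothetical example values, not taken from the thread.
    drives = 6
    rail_capacity = 300.0   # assumed usable 12 V capacity in watts
    base_load = 150.0       # assumed CPU / motherboard / fans load in watts

    spare = headroom_watts(rail_capacity, base_load, drives)
    print(f"Estimated spin-up draw: {spinup_draw_watts(drives):.0f} W")
    print(f"Estimated headroom:     {spare:.0f} W")
```

This only gives a ballpark. It ignores PSU aging, cable and connector losses, and staggered spin-up, so treat a small positive headroom as a reason to investigate rather than as proof the PSU is fine.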
BigHairyWookie Posted March 29, 2020 Author

OK, I've managed to get the disks mounted and the array started. I basically shut down the server, removed the flash drive, created a copy, re-imaged the flash drive and copied the config folder back over.

One drive has come back as "Unmountable: no file system", but it's a drive with no data on it, so I'm not fussed about that.

The array is currently started and doing a parity check. I think I might wait for that to complete before doing anything else.

@testdasi I replaced the power supply a few weeks ago with a brand new 650 W unit. There was a 430 W unit in there before, which I realised wasn't going to cut it with all the new drives. Interesting point about the SATA PCI card; I might try removing that. I currently have both parity drives and one array drive on it, so I guess I'll need to be careful how I move stuff around.
JorgeB Posted March 29, 2020

2 minutes ago, BigHairyWookie said:
It's currently started and doing a parity check. I think I might wait for that to complete before doing anything else.

That's a good idea. You might also want to enable the syslog mirror/server feature to see if it catches anything if the server crashes again.

3 minutes ago, BigHairyWookie said:
One drive has come back as Unmountable: no file system, but it's a drive with no data on, so I'm not fussed about that.

That's likely fixable with a filesystem check: https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
BigHairyWookie Posted April 1, 2020 Author

Thought I'd post a quick update. I successfully ran the parity check, then removed the SATA card and some of the disks. It gave me the opportunity to actually use one of the drives as an on-site backup, which is probably better for my sanity.

I've also set up the syslog mirror to the flash drive, just in case. I just need to remember to move this off the flash drive at some point.

So far we've been running for nearly 21 hours with no errors. I'll slowly start to add the plugins back in now.