
A lot of weird problems after several power outages


dgtlman


Two weeks ago, our house had several power outages. We have a whole house generator that powers the house automatically 90 seconds after the power goes out. To help bridge the gap, I have a UPS connected to my Unraid box. In addition, I have the Unraid install set up to reboot on power loss and automatically mount the array as soon as it is powered up.

 

We were out of town for a week, and while we were away the power went out four times. That shouldn't have affected Unraid at all. Unfortunately, the battery on the UPS went bad, and when the power went out, the Unraid box immediately turned off. (I have since replaced the battery to prevent this from happening again.)

 

Each time it came back online, it triggered a parity check, which passed. Except the third time: the power went out again in the middle of the check. This has caused some weird anomalies. For example, I can't write to the plex movies folder. I have run the permission checker (from tools) and it doesn't change things. Also, even though the array is started, the web GUI says it is stopped, which causes dockers to act erratically (including the cloudcommander which I use to manage files and folders).

 

Fortunately, there are two lucky situations. First, in the past few weeks, Best Buy had 14TB easystore drives on sale, so I had a few of them that I hadn't added into the Unraid yet. Second, midnight commander still works, and I have ssh'ed in and am backing up everything to the easystore drives. It is taking a while, but the backup will allow me some breathing room before trying to fix this.

 

*exhale*

 

So, I am strongly leaning towards just rebuilding the Unraid box from scratch after backing it all up. This should ensure that all weirdness and problems are gone. There seem to be too many really big issues to put my finger on any one of them, so I would rather wipe everything and reload. It will take time to do the data shuffle, but in the long run it may actually be the fastest and best option. Any suggestions or recommendations about this would be appreciated.

 

Second, I want to restore everything the easiest and best way possible. Most of the dockers are pretty straightforward. The one I am not looking forward to is Plex. I have spent a disgusting amount of time customizing the libraries with art and collections. I don't want to lose all the work. I wish there was a clean way to do a database backup and restore. Since there is not, does anyone have a suggestion or tutorial on how to do it without losing all the work I have done on the library?
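One hedged way to carry the Plex customizations across a rebuild is to archive the container's whole appdata folder while the container is stopped, then unpack it into the same location afterwards. The sketch below only shows the archive and restore steps. The function names are hypothetical, and the folder you point it at (for example an appdata share for Plex) is an assumption that depends on how your container's paths are mapped.

```python
import tarfile
from pathlib import Path


def archive_directory(source_dir: str, archive_path: str) -> int:
    """Pack source_dir into a gzip-compressed tar file.

    Returns the number of entries (directories and files) in the archive.
    Stop the container first so its database is not being written to.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store entries relative to the folder name, not the absolute path.
        tar.add(str(source), arcname=source.name)
    # Re-open the archive to confirm it is readable and count its entries.
    with tarfile.open(archive_path, "r:gz") as tar:
        return len(tar.getnames())


def restore_directory(archive_path: str, target_parent: str) -> None:
    """Unpack the archive so the original folder reappears under target_parent."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_parent)
```

Usage would be along the lines of `archive_directory("<your Plex appdata folder>", "<path on the backup drive>/plex.tar.gz")` before the wipe and `restore_directory(...)` into the appdata share afterwards, before starting the container again.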

 

I have been using Unraid as my primary NAS for 4 years now without a problem. I am assuming that this is just an anomaly caused by the frequent power cycles that caused the array to stop uncleanly too many times. But then again, that is just an educated guess.

 

Any advice, tips, assistance.... or even smacks against the back of the head (kidding) would be greatly appreciated.

iron-diagnostics-20211130-1547.zip

Edited by dgtlman
added diagnostics
Link to comment
1 minute ago, dgtlman said:

Except the third time: the power went out again in the middle of the check. This has caused some weird anomalies. For example, I can't write to the plex movies folder. I have run the permission checker (from tools) and it doesn't change things. Also, even though the array is started, the web GUI says it is stopped, which causes dockers to act erratically (including the cloudcommander which I use to manage files and folders).

Post your diagnostics

 

Link to comment

Can you restart the server and then post the diagnostics again? Your existing logs are 100% full and logging has basically stopped. Primarily, I would think, due to

Nov 26 22:58:39 iron inetd[12345]: accept (for ftp): Invalid argument
Nov 26 22:58:39 iron inetd[12345]: accept (for telnet): Invalid argument
Nov 26 22:58:39 iron inetd[12345]: accept (for ftp): Invalid argument

Which seem to start before the server is even started up.
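To see which message is filling a log like this, a rough tally of repeated lines can help. The following is a minimal sketch that assumes the classic `Mon DD HH:MM:SS host program[pid]: message` layout seen in the lines above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches "Mon DD HH:MM:SS host program[pid]: message"; the [pid] part is optional.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^\s\[:]+)(?:\[\d+\])?:\s+(.*)$"
)


def top_messages(lines, limit=5):
    """Return the most frequent (program, message) pairs with their counts."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts.most_common(limit)
```

Feeding it the lines of a syslog file (for example `top_messages(open(path))`, with `path` pointing at the syslog from the diagnostics zip) lists the noisiest sources first.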

Link to comment

The problem with the logs filling up has only happened in the past few days. At this point, I have a single directory that needs to finish backing up. This should be done in about 24 hours. At that point I will restart and post the new diagnostics. I can also be a little more aggressive about trying things knowing that all my data is safely backed up.

 

Thanks for your help. 

Link to comment
59 minutes ago, trurl said:

You should fix your go file, it is starting emhttpd twice.

 

I commented out the second instance. I am assuming that this would not have anything to do with not being able to write to directories though, correct?
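For anyone wanting to check their own go file for the same problem, here is a small sketch that counts uncommented lines launching the management utility. It assumes the launcher is invoked as a command whose file name is `emhttp` or `emhttpd`; the function name is hypothetical, and the go file location on the flash drive should be confirmed on your own system.

```python
def count_emhttp_starts(go_text: str) -> int:
    """Count uncommented lines whose command is emhttp or emhttpd."""
    count = 0
    for raw in go_text.splitlines():
        line = raw.strip()
        # Skip blank lines and shell comments (including the shebang).
        if not line or line.startswith("#"):
            continue
        command = line.split()[0]
        # Compare only the file name, so any install path is accepted.
        if command.rsplit("/", 1)[-1] in ("emhttp", "emhttpd"):
            count += 1
    return count
```

A result greater than 1 for the contents of the go file would indicate the duplicate start described above.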

Edited by dgtlman
clarification
Link to comment

This morning I woke up and checked the status of the backup (using MC to a USB drive). It showed a little over an hour left. After a moment, I realized that it had actually stopped and the ssh connection had been dropped.
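When a copy is interrupted like this, it is worth checking what actually made it to the backup drive before wiping anything. The sketch below compares two directory trees by relative path and file size only. It does not verify file contents, and the function names are hypothetical.

```python
import os


def list_files(root):
    """Map each file's path (relative to root) to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def compare_trees(source_root, backup_root):
    """Return (missing, size_mismatch) lists of relative paths.

    missing: files present in the source but absent from the backup.
    size_mismatch: files present in both whose sizes differ,
    which usually means a partially copied file.
    """
    src = list_files(source_root)
    dst = list_files(backup_root)
    missing = sorted(p for p in src if p not in dst)
    size_mismatch = sorted(p for p in src if p in dst and src[p] != dst[p])
    return missing, size_mismatch
```

Two empty lists mean every source file exists in the backup with the same size. A checksum comparison would be a stronger check, at the cost of reading every file twice.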

 

After an hour of being patient and troubleshooting, I realized that the entire thing must have locked up, as it was not responding over the network or on an attached monitor. I even verified the switch logs, and the port was no longer getting any status back from the ethernet card. So, not that I wanted to, but I did an unclean shut down as that was the only way to get back in.

 

On boot up, it looks like the issue of the GUI not showing the correct array status has been resolved. I am assuming that this was a mismatched state caused by two instances of emhttpd running. Now that the second instance is commented out, the GUI seems to be running fine, although it reverted back to the default identification name (tower). This is not the first time this has happened and I am not sure why it keeps doing that.

 

Now I need to either determine and resolve what is causing the directory not allowing things to write to it... or re-setup the entire Unraid box from a clean install. 

tower-diagnostics-20211201-1128.zip

Edited by dgtlman
Link to comment
35 minutes ago, dgtlman said:

the default identification name (tower)

According to your diagnostics (config/ident.cfg) that is what it should be. Don't see how having an extra emhttpd start in go could be a solution to that.

 

Any settings going to default might be a sign that the .cfg couldn't be read on flash for some reason.

 

40 minutes ago, dgtlman said:

re-setup the entire Unraid box from a clean install. 

Each boot is already a "clean install". The OS is completely contained in archives on the flash drive. Those archives are unpacked, exactly as they were when first created, into RAM at each boot, and the OS runs in RAM. Think of it as firmware but easier to work with.

 

48 minutes ago, dgtlman said:

unclean shut down as that was the only way to get back in

syslog is in RAM, just like the rest of the OS, and it is included in diagnostics, but has nothing in it from before reboot.

 

Set up a syslog server so the syslog can be saved somewhere.
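To illustrate the idea of saving the log somewhere that survives a crash, here is a minimal sketch of a UDP receiver running on another machine that appends whatever syslog datagrams it receives to a file. It is not Unraid's own syslog feature, and the function names are hypothetical. Syslog over UDP conventionally uses port 514, which normally requires elevated privileges to bind, so the defaults below use a loopback address and an OS-assigned port purely for demonstration.

```python
import socket


def open_listener(host="127.0.0.1", port=0, timeout=5.0):
    """Bind a UDP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)  # recvfrom raises an error instead of waiting forever
    sock.bind((host, port))
    return sock


def receive_to_file(sock, log_path, max_messages):
    """Append up to max_messages received datagrams to log_path, one per line."""
    with open(log_path, "a", encoding="utf-8") as fh:
        for _ in range(max_messages):
            data, addr = sock.recvfrom(8192)
            text = data.decode("utf-8", errors="replace").rstrip()
            # Prefix each line with the sender's address.
            fh.write(f"{addr[0]} {text}\n")
            fh.flush()  # keep the file current in case the receiver is interrupted
```

Because each line is flushed as it arrives, the file on the receiving machine keeps the last messages sent before a lockup, which is exactly what the in-RAM syslog loses on reboot.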

Link to comment

I suppose I was unclear. By re-set up, I mean I need to do a complete wipe of the drives and set it up as if it was a brand new configuration. I can't even backup my appdata using the utility. I have to manually do it from MC. 

 

What is the best way to restore appdata after manually backing it up?

Link to comment
