doesntremindm3

unRAID unresponsive after parity sync, panicking a little

I lost a data disk yesterday, replaced it with a disk larger than the current parity disk, and let it sync overnight.  I woke up this morning and it was progressing.  94%...95%....96%, so I decided to give it more time.  An hour later, the webGUI won't respond, but the server pings and I can access it via PuTTY.  Hard drive lights show NO activity.  I have access to the unMenu plugin webGUI.  I was able to get the following from there:

[Two screenshots of unMenu output were attached here.]

What would you advise I do next?

7 minutes ago, doesntremindm3 said:

replaced it with a disk larger than the current parity disk, and let it sync overnight.

Do you mean you were using the swap disable procedure? Looks like that is what the screenshot is saying.

What version of unRAID is this? I barely remember unMenu, and not many people on the forum have ever heard of it. unRAID v4.7 is when I came in. The problem with getting so out-of-date is that nobody is around to help anymore. I recommend upgrading to v6 once you get things squared away. Even if you don't need any of the fancy new stuff, it is a much better NAS.

I don't know if the wait is normal for your version or not. I do know that swap disable first copies parity to the new, larger disk with the array offline, and only then rebuilds the failed data disk onto the old parity disk. Once the copy is complete, it should just be a normal rebuild, I would think.

I know some of those older versions used to OOM (out-of-memory) on occasion and kill the web interface. Do you have any idea how to troubleshoot things at the command line? Can you tell if the emhttp process is still running, for example?
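
If it helps, here is a minimal sketch for checking whether the kernel's OOM killer has fired. It assumes the system log lives at /var/log/syslog on your version (an assumption, so adjust the path if yours differs) and that the kernel logged its usual "Out of memory" or "oom-killer" messages. The name `oom_lines` is just a hypothetical helper, not anything unRAID ships.

```shell
# oom_lines: read log text on stdin and print only the lines that look like
# kernel OOM-killer messages. The two patterns are the phrases the Linux
# kernel normally logs; the match is case-insensitive.
oom_lines() {
    grep -iE 'out of memory|oom-killer'
}

# Usage on the server (the log path is an assumption):
#   oom_lines < /var/log/syslog
# No output means no OOM messages were found in that file.
```

If nothing is printed, the webGUI probably hung for some other reason than being killed for memory.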


Yes, the swap disable procedure from here: https://wiki.unraid.net/The_parity_swap_procedure for v5.

It's v5.  I can't get to any specifics without the webGUI.  Absolutely looking at a migration plan after this.

There are some versions with a wait?  Like without any activity?  My hard drive lights are still showing no activity.

Can you help me determine if emhttp is running?  I'm not great at the command line.  Running "ps" from PuTTY just shows bash and ps running.  That doesn't sound right.  This information below was from PS Info in unmenu and I think it's what I need.  It says that it is running, right?

(from /usr/bin/ps -eaf)

UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Sep10 ? 00:00:07 init
root 2 0 0 Sep10 ? 00:00:00 [kthreadd]
root 3 2 0 Sep10 ? 00:00:14 [ksoftirqd/0]
root 5 2 0 Sep10 ? 00:00:00 [kworker/0:0H]
root 7 2 0 Sep10 ? 00:00:00 [kworker/u:0H]
root 8 2 0 Sep10 ? 00:00:00 [migration/0]
root 9 2 0 Sep10 ? 00:00:00 [rcu_bh]
root 10 2 0 Sep10 ? 00:00:25 [rcu_sched]
root 11 2 0 Sep10 ? 00:00:00 [migration/1]
root 12 2 0 Sep10 ? 00:00:14 [ksoftirqd/1]
root 14 2 0 Sep10 ? 00:00:00 [kworker/1:0H]
root 15 2 0 Sep10 ? 00:00:01 [migration/2]
root 16 2 0 Sep10 ? 00:00:11 [ksoftirqd/2]
root 18 2 0 Sep10 ? 00:00:00 [kworker/2:0H]
root 19 2 0 Sep10 ? 00:00:00 [migration/3]
root 20 2 0 Sep10 ? 00:00:12 [ksoftirqd/3]
root 22 2 0 Sep10 ? 00:00:00 [kworker/3:0H]
root 23 2 0 Sep10 ? 00:00:00 [khelper]
root 177 2 0 Sep10 ? 00:00:00 [bdi-default]
root 179 2 0 Sep10 ? 00:00:00 [kblockd]
root 299 2 0 Sep10 ? 00:00:00 [ata_sff]
root 309 2 0 Sep10 ? 00:00:00 [khubd]
root 419 2 0 Sep10 ? 00:00:00 [rpciod]
root 439 2 5 Sep10 ? 00:47:35 [kswapd0]
root 500 2 0 Sep10 ? 00:00:00 [fsnotify_mark]
root 520 2 0 Sep10 ? 00:00:00 [nfsiod]
root 524 2 0 Sep10 ? 00:00:00 [cifsiod]
root 539 2 0 Sep10 ? 00:00:00 [crypto]
root 712 2 0 Sep10 ? 00:00:00 [deferwq]
root 716 2 0 Sep10 ? 00:00:00 [scsi_eh_0]
root 717 2 0 Sep10 ? 00:00:00 [usb-storage]
root 755 1 0 Sep10 ? 00:00:00 /sbin/udevd --daemon
root 845 2 0 Sep10 ? 00:00:00 [scsi_eh_1]
root 846 2 0 Sep10 ? 00:00:00 [scsi_eh_2]
root 858 2 0 Sep10 ? 00:00:00 [scsi_eh_3]
root 859 2 0 Sep10 ? 00:00:00 [scsi_eh_4]
root 860 2 0 Sep10 ? 00:00:00 [scsi_eh_5]
root 863 2 0 Sep10 ? 00:00:00 [scsi_eh_6]
root 866 2 0 Sep10 ? 00:00:00 [scsi_eh_7]
root 867 2 0 Sep10 ? 00:00:00 [scsi_eh_8]
root 882 2 0 Sep10 ? 00:00:00 [scsi_eh_9]
root 884 2 0 Sep10 ? 00:00:00 [scsi_eh_10]
root 886 2 0 Sep10 ? 00:00:00 [kworker/u:5]
root 892 2 0 Sep10 ? 00:00:00 [scsi_eh_11]
root 893 2 0 Sep10 ? 00:00:00 [kworker/u:8]
root 929 2 0 Sep10 ? 00:00:03 [kworker/0:1H]
root 930 2 0 Sep10 ? 00:00:03 [kworker/3:1H]
root 931 2 0 Sep10 ? 00:00:03 [kworker/1:1H]
root 932 2 0 Sep10 ? 00:00:03 [kworker/2:1H]
root 993 2 0 Sep10 ? 00:00:00 [scsi_wq_2]
root 1065 1 0 Sep10 ? 00:00:00 /usr/sbin/syslogd -m0
root 1069 1 0 Sep10 ? 00:00:00 /usr/sbin/klogd -c 3 -x
root 1099 1 0 Sep10 ? 00:00:00 /sbin/dhcpcd -t 10 -h storage -L eth0
bin 1124 1 0 Sep10 ? 00:00:00 /sbin/rpc.portmap
root 1133 1 0 Sep10 ? 00:00:00 /sbin/rpc.statd
root 1147 1 0 Sep10 ? 00:00:00 /usr/sbin/inetd
root 1161 1 0 Sep10 ? 00:00:00 /usr/sbin/acpid
81 1171 1 0 Sep10 ? 00:00:00 /usr/bin/dbus-daemon --system
root 1176 1 0 Sep10 ? 00:00:00 /usr/sbin/crond -l notice
daemon 1178 1 0 Sep10 ? 00:00:00 /usr/sbin/atd -b 15 -l 1
root 1820 1 26 Sep10 ? 04:06:07 /usr/local/sbin/emhttp
root 3763 1 0 Sep10 ? 00:00:00 /bin/bash /boot/unmenu/uu
root 3764 1 0 Sep10 ? 00:00:00 logger -tunmenu -plocal7.info -is
root 3767 1 0 Sep10 tty1 00:00:00 /sbin/agetty 38400 tty1 linux
root 3768 1 0 Sep10 tty2 00:00:00 /sbin/agetty 38400 tty2 linux
root 3769 1 0 Sep10 tty3 00:00:00 /sbin/agetty 38400 tty3 linux
root 3770 1 0 Sep10 tty4 00:00:00 /sbin/agetty 38400 tty4 linux
root 3771 1 0 Sep10 tty5 00:00:00 /sbin/agetty 38400 tty5 linux
root 3772 1 0 Sep10 tty6 00:00:00 /sbin/agetty 38400 tty6 linux
root 7680 2 0 Sep10 ? 00:00:54 [kworker/2:2]
root 13624 2 0 06:57 ? 00:00:03 [kworker/0:0]
root 13673 2 0 06:57 ? 00:00:06 [kworker/1:0]
root 13874 755 0 Sep10 ? 00:00:00 /sbin/udevd --daemon
root 13917 1 0 Sep10 ? 00:00:00 /usr/sbin/nmbd -D
root 13919 1 0 Sep10 ? 00:00:00 /usr/sbin/smbd -D
root 13924 13919 0 Sep10 ? 00:00:00 /usr/sbin/smbd -D
avahi 13932 1 0 Sep10 ? 00:00:00 avahi-daemon: running [storage.local]
avahi 13933 13932 0 Sep10 ? 00:00:00 avahi-daemon: chroot helper
root 13941 1 0 Sep10 ? 00:00:00 /usr/sbin/avahi-dnsconfd -D
root 15974 2 0 Sep10 ? 00:00:00 [md]
root 15975 2 0 Sep10 ? 00:00:00 [mdrecoveryd]
root 15977 2 0 Sep10 ? 00:00:00 [spinupd]
root 15978 2 0 Sep10 ? 00:00:00 [spinupd]
root 15979 2 0 Sep10 ? 00:00:00 [spinupd]
root 15980 2 0 Sep10 ? 00:00:00 [spinupd]
root 15981 2 0 Sep10 ? 00:00:00 [spinupd]
root 15982 2 0 Sep10 ? 00:00:00 [spinupd]
root 15983 2 0 Sep10 ? 00:00:00 [spinupd]
root 15984 2 0 Sep10 ? 00:00:00 [spinupd]
root 15985 2 0 Sep10 ? 00:00:00 [spinupd]
root 15986 2 0 Sep10 ? 00:00:00 [spinupd]
root 16486 2 0 07:12 ? 00:00:01 [kworker/3:0]
root 16586 2 0 07:12 ? 00:00:00 [kworker/0:1]
root 16599 2 0 07:12 ? 00:00:00 [kworker/2:1]
root 17525 2 0 07:23 ? 00:00:00 [kworker/1:1]
root 17837 1 0 08:05 ? 00:00:00 /usr/sbin/ntpd -g -p /var/run/ntpd.pid
root 18106 1147 0 09:10 ? 00:00:00 in.telnetd: 192.168.1.44
root 18107 18106 0 09:10 pts/0 00:00:00 -bash
root 18532 755 0 09:48 ? 00:00:00 /sbin/udevd --daemon
root 20092 3763 0 10:18 ? 00:00:00 awk -W re-interval -f ./unmenu.awk
root 20694 20092 0 10:57 ? 00:00:00 gawk -v ConfigFile unmenu.conf -v MyHost storage -v ScriptDirectory /boot/unmenu -v AWK_PID -v LocalConfigFile unmenu_local.conf -v MyPort 8080 -W re-interval -f /boot/unmenu/29-unmenu-sysinfo.awk SWAP_DSBL GET /sys_info?option=Ps+info |Main|array_management|Array Management|disk_management|Disk Management|system_log|Syslog|myMain|myMain|links|Useful Links|disk_performance|Disk Performance|network_performance|Network Performance|usage|Disk Usage|smarthistory|Smart History|dupe_files|Dupe Files|sys_info|System Info|file_browser|File Browser|share_iso|Share ISO|user_scripts|User Scripts|config_view_edit|Config View/Edit|pkg_manager|Pkg Manager|unraid_main|unRAID Main|about|About|help|Help|
root 20696 20694 0 10:57 ? 00:00:00 sh -c /usr/bin/ps -eaf 2>&1
root 20697 20696 0 10:57 ? 00:00:00 /usr/bin/ps -eaf
root 20889 2 0 Sep10 ? 00:00:33 [kworker/3:1]
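
For reference, plain `ps` lists only the processes attached to the current terminal, which is why it showed just bash and ps, while `-e` lists every process. The /usr/local/sbin/emhttp line (PID 1820) in the listing above means emhttp is running. The same check can be sketched at the command line as follows. `is_running` is a hypothetical helper name, and it compares against the executable name in the CMD column (field 8 of the `ps -eaf` layout shown above) so that the filter cannot match its own command line.

```shell
# is_running NAME: read `ps -eaf` style output on stdin and succeed (exit 0)
# if some process's executable, with any directory path stripped, is NAME.
is_running() {
    awk -v name="$1" '
        NR > 1 {                          # skip the UID/PID/... header line
            n = split($8, parts, "/")     # field 8 is the first word of CMD
            if (parts[n] == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the server:
#   ps -eaf | is_running emhttp && echo "emhttp is running"
```

Note that a running emhttp process does not mean the webGUI is healthy, only that it has not been killed.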


After the parity copy finishes, the array should stop, and you'd need to start it again to do the disk rebuild. I'm not sure whether yours finished or got stuck at 99%. I'm pretty sure the log shows 100% on v6, but I'm not absolutely sure about v5.


I don't know how to do that without the webGUI.  Restarting might get me to that point, but I'm concerned about what it would do to my data.  It looks like a restart from this point would be as easy as "reboot" or "powerdown" though.


No access to the webUI, only unMenu. I know there were times when the webUI would get OOM-killed and unMenu would still be up. You could use it to get some information back, but you couldn't actually operate unRAID, since unMenu just put the webUI in a frame for you and let it do all the normal stuff.


Is the original disk truly dead? If not, it could be another way to get some of its files back if you have problems with the rebuild.


The original disk is truly dead.

You're right about unmenu.  It can do some stuff but loads unRAID in an unresponsive frame.  Looks like that restart is the next step.  Running sync again won't be a problem.


After restart, everything is in place to just start it again, so that's my next step.  I was worried about data loss.  I should be good from here.  Thanks!

