UnRaid Froze... couldn't even Ping... cold reboot... ALL DATA GONE!


egeis

I'm starting to lose my mind here. The effort it took to learn about this software (6.1.9), get everything set up, install Dockers, move my data, get Windows VMs operating with SCSI and GPU passthrough, etc... I invested more than 2 grand into this machine and at least 3 months of research, and all I have are headaches.

 

The UnRaid OS froze yesterday. Nothing worked. I couldn't even ping the server, never mind SSH in and try a proper powerdown. So what can you do at that point?? Well, I reset the machine.

 

I put it into maintenance mode, ran a parity check, everything came back with no errors.

 

Now I'm in the WebGUI, all my disks are fine, all the users are there... but NO SHARES, NO VMs, NO DOCKERS...  ...but somehow my plugins are there... ??  I need the shares, the Dockers, and the VM back. I don't want to repeat the agony of this...

 

If I seriously just lost ALL OF MY DATA, I'm gonna go psychotic...  What in the world am I supposed to do?

Link to comment

First thing! Stop panicking! More people have lost data through a panic attack than any other single way! Take a deep breath at this point... If your disks are fine, you haven't lost any data! (Those data disks are written with a filesystem that ANY Linux computer can read.)

 

Get a diagnostics file.  'Tools' >>>  'Diagnostics'.  Upload that file with your next post. That's so that the Gurus who can read and interpret syslogs can look at it. 

 

Now, look at the 'Array Operation' tab on the 'Main' page and see if the array has started.  If it is in the 'Maintenance' mode, you will have to get out of that. 

Link to comment

Relax... Maintenance mode starts the array but does not mount your drives, so what you are seeing is normal. Start the array in normal mode and your data should be back.
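A quick way to confirm whether the drives are actually mounted is to look at the kernel's mount table. The following is only a sketch: the function name `list_array_mounts` is made up for illustration, and it assumes the usual unRAID mount points `/mnt/diskN` and `/mnt/cache`.

```shell
# Hypothetical helper: list data mount points (/mnt/diskN, /mnt/cache)
# found in a mounts table, together with their filesystem type.
# Reads /proc/mounts by default; another file can be passed as $1.
list_array_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # /proc/mounts fields: device, mount point, filesystem type, options, ...
    awk '$2 ~ /^\/mnt\/(disk[0-9]+|cache)$/ { print $2, $3 }' "$mounts_file"
}
```

If the function prints nothing while the array is started, the array is probably still in maintenance mode or the disks failed to mount. If it lists the data disks but not `/mnt/cache`, only the cache pool is affected.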

Link to comment

Thanks for the pep talk...  ;)  I'll try and use emojis to mask my deep, emotional internal conflicts...

 

I got out of maintenance mode and tried loading the cache disks one at a time... both are coming up "unmountable"

 

I kept one unassigned and ran:

btrfs restore -v /dev/sdb1 /mnt/disk1/cachefolder
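For anyone repeating this, it may be worth previewing the restore first. `btrfs restore` has a `-D` (dry run) option that only lists the files it would recover without writing anything. The wrapper below is a sketch, not part of the original steps: `preview_restore` and the `RUN` variable are hypothetical names, and the device and target paths are simply the ones from the command above.

```shell
# Hypothetical wrapper: list what btrfs restore would recover, without
# copying anything (-D = dry run, -v = verbose).
# Set RUN=echo to print the command instead of executing it.
preview_restore() {
    dev="$1"    # source device, e.g. /dev/sdb1
    dest="$2"   # target directory; still required in dry-run mode
    ${RUN:-} btrfs restore -D -v "$dev" "$dest"
}

# Example (prints the command only):
#   RUN=echo preview_restore /dev/sdb1 /mnt/disk1/cachefolder
```

Once the file list looks right, the real restore is the same command without `-D`.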

 

This is currently operating...  it keeps asking about "looping alot" on each docker... so thank god the data is there... though I'm completely confused as to how I can put it back and expect that unRaid will recognize it...

 

After doing this, I see my normal disk shares again, and even the VM, although it was built as "cache-only" and there are no mounted cache shares... ???  This is always so confusing...

 

So here's the syslog when trying to mount the cache disks:

 

Dec 11 09:53:53 UnRaidTower logger: log =internal bsize=4096 blocks=357694, version=2
Dec 11 09:53:53 UnRaidTower logger: = sectsz=512 sunit=0 blks, lazy-count=1
Dec 11 09:53:53 UnRaidTower logger: realtime =none extsz=4096 blocks=0, rtextents=0
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (116): mkdir -p /mnt/cache
Dec 11 09:53:53 UnRaidTower emhttp: mount error: Too many misplaced devices: 1
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (117): rmdir /mnt/cache
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (118): sync
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (119): mkdir /mnt/user0
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (120): /usr/local/sbin/shfs /mnt/user0 -disks 14 -o noatime,big_writes,allow_other |& logger
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (121): mkdir /mnt/user
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (122): /usr/local/sbin/shfs /mnt/user -disks 15 2048000000 -o noatime,big_writes,allow_other -o remember=0 |& logger
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (123): cat - > /boot/config/plugins/dynamix/mover.cron <<< "# Generated mover schedule:#01240 3 * * * /usr/local/sbin/mover |& logger#012"
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (124): /usr/local/sbin/update_cron &> /dev/null
Dec 11 09:53:53 UnRaidTower sudo: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/env python /usr/bin/denyhosts.py --daemon --config=/boot/config/plugins/denyhosts/denyhosts.cfg
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (125): :>/etc/samba/smb-shares.conf
Dec 11 09:53:53 UnRaidTower avahi-daemon[13455]: Files changed, reloading.
Dec 11 09:53:53 UnRaidTower emhttp: Restart SMB...
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (126): killall -HUP smbd
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (127): cp /etc/avahi/services/smb.service- /etc/avahi/services/smb.service
Dec 11 09:53:53 UnRaidTower avahi-daemon[13455]: Files changed, reloading.
Dec 11 09:53:53 UnRaidTower avahi-daemon[13455]: Service group file /services/smb.service changed, reloading.
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (128): pidof rpc.mountd &> /dev/null
Dec 11 09:53:53 UnRaidTower emhttp: shcmd (129): /etc/rc.d/rc.atalk status
Dec 11 09:53:53 UnRaidTower emhttp: Starting Docker...
Dec 11 09:53:53 UnRaidTower logger: Not starting Docker: new image file path doesn't exist
Dec 11 09:53:53 UnRaidTower kernel: EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
Dec 11 09:53:53 UnRaidTower emhttp: Starting libvirt...
Dec 11 09:53:53 UnRaidTower logger: Starting libvirtd...
Dec 11 09:53:53 UnRaidTower dnsmasq[12072]: read /etc/hosts - 1 addresses
Dec 11 09:53:53 UnRaidTower dnsmasq[12072]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Dec 11 09:53:53 UnRaidTower dnsmasq-dhcp[12072]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Dec 11 09:53:54 UnRaidTower avahi-daemon[13455]: Service "UnRaidTower" (/services/smb.service) successfully established.
Dec 11 09:54:03 UnRaidTower emhttp: Spinning up all drives...
Dec 11 09:54:03 UnRaidTower emhttp: shcmd (130): /usr/sbin/hdparm -S0 /dev/sdc &> /dev/null
Dec 11 09:54:03 UnRaidTower kernel: mdcmd (21): spinup 0
Dec 11 09:54:03 UnRaidTower kernel: mdcmd (22): spinup 1
Dec 11 09:54:03 UnRaidTower kernel: mdcmd (23): spinup 2
Dec 11 09:54:03 UnRaidTower kernel: mdcmd (24): spinup 3

 

The diagnostics zip file is attached. I have no idea what you'd like to see from that...

unraidtower-diagnostics-20161211-1028.zip

Link to comment

GREAT NEWS: After running btrfs restore and assigning the drives one at a time, then assigning them both again... everything is suddenly back.  :o  ...I'm always a little confused when computers act like they have a mind of their own and you just have to poke some spots here and there and then everything just starts working again.

 

There was a point that this happened before... very early in the process, so I didn't care too much about rebuilding from scratch.

 

I rebooted unRaid and only one of the two disks in the cache pool came back, albeit "unmountable". I made the mistake of formatting the "unmountable" cache drive, not realizing I was formatting the one that was listed as assigned. I lost everything.

 

So I assigned the two. Rebuilt the Docker and VM software. Everything was kosher until yesterday when the server froze and went offline.

 

This morning I found all drives mounted but no shares. Did the reboot. Saw all disk shares but no cache shares. The cache disks were there but were listed with zero reads and writes and no errors. This is when I panicked.

 

I'm glad I got the btrfs restore running.  I tried to mount them one at a time and this must have created the "change in the cache pool" that you're seeing. I never did anything but assign the drives to the cache and let unRaid do its magic. So as for changing from RAID-1 to RAID-0... I have no idea how that might have happened.

 

Link to comment
