
Docker service failed to start + Libvirt service failed to start



So I went to sleep while my server was downloading large files using Deluge. I woke up this morning to a warning that my 1 TB cache drive was 91% utilized, so I clicked on Mover to start it. Then I saw a warning in the Fix Common Problems app that my log folder or file was full and that a restart would fix it...

 

So I made my biggest mistake: I rebooted while Mover was still running. After the restart I have no Dockers, because the Docker page says "Docker service failed to start". I don't actually use any VMs, but the VM page also says "Libvirt failed to start".

 

My cache drive still shows 860 GB used out of 940 GB, and whenever I click on Mover now nothing happens. The Fix Common Problems app tells me "Unable to write to cache .... Drive mounted read only or completely full".

 

I have attached my syslog and the cache disk information log as well.

 

I am seriously freaking out 

 

Please help me !! 

Syslog and cache disk log information.zip
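
For reference, the figures in this post are consistent with the 91% warning: 860 GB used out of 940 GB is roughly 91.5%. A minimal Python sketch of that check follows. The 90% threshold and the helper names are assumptions for illustration, and `/mnt/cache` is assumed to be where the cache pool is mounted.

```python
import shutil


def percent_used(used: float, total: float) -> float:
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def is_nearly_full(used: float, total: float, threshold: float = 90.0) -> bool:
    """True when usage is at or above the (assumed) warning threshold."""
    return percent_used(used, total) >= threshold


if __name__ == "__main__":
    # Figures quoted in the post: 860 GB used out of 940 GB.
    print(f"{percent_used(860, 940):.1f}% used")

    # On a live server the same check could read the mounted pool.
    # /mnt/cache is an assumption; adjust to the actual mount point.
    try:
        usage = shutil.disk_usage("/mnt/cache")
        print("nearly full:", is_nearly_full(usage.used, usage.total))
    except OSError as exc:
        print("could not read /mnt/cache:", exc)
```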


Next time please post complete diagnostics.

 

One of the cache devices (cache1) has been having (and still is) read/write errors:

Sep 27 11:21:29 Tower kernel: BTRFS info (device sdc1): bdev /dev/sdb1 errs: wr 26504, rd 62595, flush 0, corrupt 0, gen 0

See here for the future to better monitor pool.

 

Replace cables on that device, power back on, try running a scrub on the pool and post new diags, but likely pool filesystem is corrupted.
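
As a rough illustration of how the error counters in the kernel line above can be read, here is a minimal Python sketch that extracts the device and the `wr`/`rd`/`flush`/`corrupt`/`gen` values from such a line and reports which ones are nonzero. The function names are hypothetical, and the pattern only assumes the line format shown in this thread.

```python
import re

# Matches the counter portion of a BTRFS device-error line as shown above.
ERR_LINE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

# The syslog line quoted in this thread.
SAMPLE = (
    "Sep 27 11:21:29 Tower kernel: BTRFS info (device sdc1): "
    "bdev /dev/sdb1 errs: wr 26504, rd 62595, flush 0, corrupt 0, gen 0"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a matching line, or None otherwise."""
    match = ERR_LINE.search(line)
    if match is None:
        return None
    counters = {k: int(v) for k, v in match.groupdict().items() if k != "dev"}
    return match.group("dev"), counters


def nonzero(counters):
    """Keep only the counters that have recorded at least one error."""
    return {name: count for name, count in counters.items() if count}


if __name__ == "__main__":
    device, counters = parse_btrfs_errs(SAMPLE)
    print(device, nonzero(counters))
```

On a healthy pool all five counters stay at 0, so any nonzero value is worth investigating.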

14 minutes ago, johnnie.black said:

Next time please post complete diagnostics.

 

One of the cache devices (cache1) has been having (and still is) read/write errors:


Sep 27 11:21:29 Tower kernel: BTRFS info (device sdc1): bdev /dev/sdb1 errs: wr 26504, rd 62595, flush 0, corrupt 0, gen 0

See here for the future to better monitor pool.

 

Replace cables on that device, power back on, try running a scrub on the pool and post new diags, but likely pool filesystem is corrupted.

Sorry, I am really not familiar with the troubleshooting process. Here is the diagnostics file, taken before running a scrub or anything else.

I appreciate you taking the time to look at it.

tower-diagnostics-20190927-1018.zip

On 9/27/2019 at 2:23 PM, johnnie.black said:

You'll need to back up cache pool and re-format, but recommend only doing that after replacing the cables.

OK, that makes sense, but here is the problem now. After you told me yesterday that I need to back up my cache pool and replace the cables, I came home today and wanted to check that the current cables were connected properly, just to make sure before I went ahead and replaced them.

So I disconnected both SSD cache drives and reconnected them, made sure the other ends of the cables were firmly connected to the motherboard, then started my Unraid server to see whether the problem had gone away. If not, the plan was to back up my cache pool and replace the cables. But now I have a new problem: Unraid took one of the SSD cache drives out of the pool and put it under Unassigned Devices, and it says the size is 16 KB?!

The other cache drive is still in its place and shows the correct size, temperature, and even name.

Please have a look at the attached picture.

Cache.PNG

15 minutes ago, johnnie.black said:

That would suggest the SSD failed, try it on a different SATA port to confirm.

Yup, I just changed the SATA port as you suggested, and it is still the same situation as shown in the picture above.

What do I do now? I know that means the cache pool is no longer usable, but is there any way to get anything out of it?

What about the second SSD, can I get anything out of it? My mistake was putting them both in a pool. As far as I know, if one fails, the data on both of them is pretty much gone forever. Is that correct?

On 9/29/2019 at 12:22 PM, johnnie.black said:

Pool was raid1, so most data should still be available, though likely not all because of the filesystem corruption. Start the pool with just the remaining device; if it doesn't mount, see here for some recovery options.

Unfortunately, they were both in raid zero... I don't think they are available anymore.
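
Whether a pool's data is raid0, raid1, or single can be read from the output of `btrfs filesystem df` on the mounted pool, which prints one line per block-group type together with its profile. A minimal Python sketch for reading that output follows. The sample text is illustrative only (its sizes are made up and are not taken from this server's diagnostics), and `parse_profiles` and `data_survives_one_device_loss` are hypothetical helper names.

```python
import re

# One line per block-group type, e.g. "Data, single: total=..., used=..."
PROFILE_LINE = re.compile(r"^(?P<kind>\w+), (?P<profile>[\w-]+):")

# Illustrative sample in the format printed by `btrfs filesystem df <mount>`.
# The sizes are invented for the example, not taken from the thread.
SAMPLE = """\
Data, single: total=870.00GiB, used=860.00GiB
System, RAID1: total=32.00MiB, used=112.00KiB
Metadata, RAID1: total=2.00GiB, used=1.10GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

# Profiles that keep a second copy (or parity) on another device.
REDUNDANT = {"RAID1", "RAID1C3", "RAID1C4", "RAID10", "RAID5", "RAID6"}


def parse_profiles(df_output):
    """Map each block-group type (Data, Metadata, ...) to its profile."""
    profiles = {}
    for line in df_output.splitlines():
        match = PROFILE_LINE.match(line.strip())
        if match:
            profiles[match.group("kind")] = match.group("profile")
    return profiles


def data_survives_one_device_loss(profiles):
    """True only if the Data profile is one of the redundant profiles."""
    return profiles.get("Data", "").upper() in REDUNDANT


if __name__ == "__main__":
    profiles = parse_profiles(SAMPLE)
    print(profiles)
    print("data redundant:", data_survives_one_device_loss(profiles))
```

With a `single` or `RAID0` data profile, losing one device generally means losing whatever file data was stored on it, even when the metadata is mirrored.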

On 10/1/2019 at 11:21 AM, livingonline8 said:

unfortunately, they were both in raid zero..

Pool was raid1

 

Correction: metadata was raid1 but data was single. You still might be able to recover some data with btrfs restore, but it will likely be mostly incomplete/corrupt.
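
As a rough sketch of what a `btrfs restore` attempt might look like, the Python snippet below only builds and prints the command line; it does not run anything. The device `/dev/sdX1` and the destination `/mnt/disk1/restore` are placeholders, not values from this thread, and `build_restore_cmd` is a hypothetical helper. The flags used are `-v` (verbose), `-D` (dry run, list what would be restored) and `-i` (ignore errors and keep going).

```python
import shlex


def build_restore_cmd(device, dest, dry_run=True, ignore_errors=False):
    """Build a `btrfs restore` argument list without executing it."""
    cmd = ["btrfs", "restore", "-v"]
    if dry_run:
        cmd.append("-D")  # only list files, write nothing
    if ignore_errors:
        cmd.append("-i")  # continue past errors on damaged files
    cmd += [device, dest]
    return cmd


if __name__ == "__main__":
    # Placeholders: replace with the surviving pool device and a
    # destination on a different, healthy disk with enough free space.
    device = "/dev/sdX1"
    dest = "/mnt/disk1/restore"

    # First a dry run, then the real attempt that ignores errors.
    print(shlex.join(build_restore_cmd(device, dest)))
    print(shlex.join(build_restore_cmd(device, dest, dry_run=False,
                                       ignore_errors=True)))
```

Starting with the dry run gives an idea of what is still reachable before anything is written to the destination.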

Edited by johnnie.black