
Need help: Docker service failed to start [6.8.3]



Yesterday, out of the blue, I received errors on my cache pool (2x NVMe M.2 disks). The Unraid main page didn't report these errors, even when disk 2 of that pool went offline. Today I restarted the server; Unraid comes up but can't start the Docker service. In syslog I see lots of BTRFS errors, but Unraid still doesn't show any problems.

 

It seems that the cache pool no longer works, but Unraid is behaving as if nothing had happened.

 

What are the steps to get the cache pool - and Dockers and VMs - back into operation? Rebalance?

 

Diagnostics attached.

 

Many thanks in advance.

 

tower-diagnostics-20201102-0757.zip


Update: I was able to get the Docker/VM services to start. I had to delete the docker.img file, which was corrupt. All Dockers were recreated and are currently running.

 

BUT: BTRFS still shows errors on my cache pool. What do I need to do to fix these?

 

Many thanks in advance.

 

Edited by hawihoney

I'm running the stats regularly. That's why I saw the errors. But Unraid still hasn't noticed them.

 

If I call stats, this is the result:

 

[/dev/nvme1n1p1].write_io_errs 0
[/dev/nvme1n1p1].read_io_errs 0
[/dev/nvme1n1p1].flush_io_errs 0
[/dev/nvme1n1p1].corruption_errs 0
[/dev/nvme1n1p1].generation_errs 0
[/dev/nvme0n1p1].write_io_errs 0
[/dev/nvme0n1p1].read_io_errs 0
[/dev/nvme0n1p1].flush_io_errs 0
[/dev/nvme0n1p1].corruption_errs 0
[/dev/nvme0n1p1].generation_errs 0
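
Since I check these counters regularly, here is a minimal sketch of how output in this format could be checked by a script. It assumes the text has the same `[device].counter value` layout as the output above; the function names `parse_device_stats` and `devices_with_errors` are made up for illustration and are not part of any Unraid or btrfs tool.

```python
import re

# Matches lines such as "[/dev/nvme1n1p1].write_io_errs 0"
STATS_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {device: {counter_name: value}} from stats-style text."""
    stats = {}
    for line in text.splitlines():
        match = STATS_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines or anything in another format
        counters = stats.setdefault(match["dev"], {})
        counters[match["counter"]] = int(match["value"])
    return stats


def devices_with_errors(stats):
    """Keep only devices that have at least one non-zero counter."""
    return {
        dev: {name: value for name, value in counters.items() if value}
        for dev, counters in stats.items()
        if any(counters.values())
    }


if __name__ == "__main__":
    sample = """\
[/dev/nvme1n1p1].write_io_errs 0
[/dev/nvme1n1p1].corruption_errs 0
[/dev/nvme0n1p1].write_io_errs 0
[/dev/nvme0n1p1].corruption_errs 0
"""
    print(devices_with_errors(parse_device_stats(sample)))
```

With every counter at 0, as in my output, nothing is flagged, which is exactly why the stats alone don't match what syslog shows.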

Looking at syslog at the same time shows:

Nov  2 11:00:04 Tower kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 15050349 off 1114939392 csum 0x382b6324 expected csum 0x54474642 mirror 2
Nov  2 11:00:04 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 15050349 off 1114939392 (dev /dev/nvme1n1p1 sector 534070312)
Nov  2 11:12:38 Tower kernel: BTRFS error (device nvme0n1p1): parent transid verify failed on 1481548267520 wanted 16496481 found 16461691
Nov  2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548267520 (dev /dev/nvme1n1p1 sector 337334848)
Nov  2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548271616 (dev /dev/nvme1n1p1 sector 337334856)
Nov  2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548275712 (dev /dev/nvme1n1p1 sector 337334864)
Nov  2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 1481548279808 (dev /dev/nvme1n1p1 sector 337334872)
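
To compare this with the stats, the syslog lines can be summarized as well. The following is only a sketch, assuming kernel messages in the same format as the lines above; `summarize_btrfs_log` is a hypothetical helper name, and the regular expressions only cover the message shapes seen in this excerpt.

```python
import re
from collections import Counter

# "kernel: BTRFS <level> (device <fs device>): <message>"
BTRFS_LINE = re.compile(
    r"kernel: BTRFS (?P<level>\w+) \(device (?P<fs>[^)]+)\): (?P<msg>.*)$"
)
# "read error corrected: ... (dev /dev/xxx sector N)"
CORRECTED = re.compile(r"read error corrected: .*\(dev (?P<dev>\S+) sector \d+\)")


def summarize_btrfs_log(lines):
    """Count BTRFS messages per level and corrected reads per device."""
    levels = Counter()
    corrected = Counter()
    for line in lines:
        match = BTRFS_LINE.search(line)
        if match is None:
            continue  # not a BTRFS kernel message
        levels[match["level"]] += 1
        fixed = CORRECTED.search(match["msg"])
        if fixed is not None:
            corrected[fixed["dev"]] += 1
    return levels, corrected


if __name__ == "__main__":
    sample = [
        "Nov  2 11:12:38 Tower kernel: BTRFS error (device nvme0n1p1): "
        "parent transid verify failed on 1481548267520 wanted 16496481 found 16461691",
        "Nov  2 11:12:38 Tower kernel: BTRFS info (device nvme0n1p1): "
        "read error corrected: ino 0 off 1481548267520 (dev /dev/nvme1n1p1 sector 337334848)",
    ]
    levels, corrected = summarize_btrfs_log(sample)
    print(dict(levels), dict(corrected))
```

In the excerpt above, every "read error corrected" message names /dev/nvme1n1p1, so grouping by that device makes it easier to see which pool member the corrections land on.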

 

The link in your answer mentioned scrub. Is scrub another name for balance?

 

Many thanks in advance.

 
