Help: Docker issue / cache drive corruption?


One of my dockers (CrashPlan) was no longer working, so I restarted it. That did not work, so I pressed EDIT and saved right away. That has resulted in an "orphaned image" on my Docker screen.

 

Another docker (Plex) was no longer functioning. I stopped it and it does not want to restart.

 

Checked the syslog, it is full of:  BTRFS (device loop0): parent transid verify failed

 

What is my next move?

 

Jan 19 15:42:53 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:53 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:53 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:53 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:53 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:54 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:54 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:54 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:54 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: verify_parent_transid: 52 callbacks suppressed
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:42:59 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:04 Tower kernel: verify_parent_transid: 64 callbacks suppressed
Jan 19 15:43:04 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:04 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:04 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:04 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:05 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:05 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:05 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:05 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:05 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:05 Tower kernel: BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
Jan 19 15:43:08 Tower php: /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker 'stop' 'PlexMediaServer'
Jan 19 15:43:16 Tower php: /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker 'start' 'PlexMediaServer'
Jan 19 15:43:24 Tower emhttp: cmd: /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker logs --tail=350 -f PlexMediaServer
Jan 19 15:43:31 Tower emhttp: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 19 15:44:13 Tower php: /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker 'start' 'PlexMediaServer'
Jan 19 15:45:18 Tower in.telnetd[9371]: connect from 192.168.1.36 (192.168.1.36)
Jan 19 15:45:24 Tower login[9372]: ROOT LOGIN on '/dev/pts/1' from '192.168.1.36'
Jan 19 15:46:51 Tower php: /usr/local/emhttp/plugins/dynamix/scripts/btrfs_scrub 'start' '/mnt/cache' '-r'
Jan 19 15:47:02 Tower autofan: Highest disk temp is 37°C, adjusting fan speed from: 75 (29% @ 704rpm) to: 50 (19% @ 700rpm)

 

This goes on and on.
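Since the same message repeats so often, it can help to collapse the log into a count per distinct failure. Below is a minimal sketch, assuming Python 3 is available somewhere the syslog can be read; the function name `summarize_transid_failures` is made up for illustration and the regular expression is based only on the line format pasted above.

```python
import re
from collections import Counter

# Matches kernel lines shaped like the ones pasted in this post, e.g.
#   BTRFS (device loop0): parent transid verify failed on 233373696 wanted 524277 found 523128
TRANSID_RE = re.compile(
    r"BTRFS \(device (?P<dev>\S+)\): parent transid verify failed "
    r"on (?P<block>\d+) wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def summarize_transid_failures(lines):
    """Count failures per (device, block, wanted, found) combination.

    Hypothetical helper: `lines` is any iterable of syslog lines.
    """
    counts = Counter()
    for line in lines:
        match = TRANSID_RE.search(line)
        if match:
            key = (
                match["dev"],
                int(match["block"]),
                int(match["wanted"]),
                int(match["found"]),
            )
            counts[key] += 1
    return counts
```

Passing an open syslog file to `summarize_transid_failures` and printing `most_common()` on the result would show whether every failure points at the same block (as it seems to here) or whether several blocks are affected.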

 

It started this morning apparently:

 

Jan 19 08:27:03 Tower kernel: blk_update_request: I/O error, dev loop0, sector 0
Jan 19 08:27:03 Tower kernel: BTRFS: bdev /dev/loop0 errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
Jan 19 08:27:03 Tower kernel: BTRFS: error (device loop0) in write_all_supers:3498: errno=-5 IO failure (errors while submitting device barriers.)
Jan 19 08:27:03 Tower kernel: BTRFS info (device loop0): forced readonly
Jan 19 08:27:03 Tower kernel: BTRFS warning (device loop0): Skipping commit of aborted transaction.
Jan 19 08:27:03 Tower kernel: ------------[ cut here ]------------
Jan 19 08:27:03 Tower kernel: WARNING: CPU: 2 PID: 27409 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x4d/0x10e()
Jan 19 08:27:03 Tower kernel: BTRFS: Transaction aborted (error -5)
Jan 19 08:27:03 Tower kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables kvm_intel kvm vhost_net vhost macvtap macvlan tun xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat md_mod nct6775 hwmon_vid coretemp e1000e mvsas ahci libsas ptp i2c_i801 libahci scsi_transport_sas pps_core ipmi_si
Jan 19 08:27:03 Tower kernel: CPU: 2 PID: 27409 Comm: btrfs-transacti Not tainted 4.1.13-unRAID #1
Jan 19 08:27:03 Tower kernel: Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0b 09/17/2012
Jan 19 08:27:03 Tower kernel: 0000000000000009 ffff8803f32bbc58 ffffffff815f12b0 0000000000000000
Jan 19 08:27:03 Tower kernel: ffff8803f32bbca8 ffff8803f32bbc98 ffffffff8104775b ffff8803f32bbce8
Jan 19 08:27:03 Tower kernel: ffffffff81286a7c 00000000fffffffb ffff88041a4e7800 ffff8803f2dc92c0
Jan 19 08:27:03 Tower kernel: Call Trace:
Jan 19 08:27:03 Tower kernel: [<ffffffff815f12b0>] dump_stack+0x4c/0x6e
Jan 19 08:27:03 Tower kernel: [<ffffffff8104775b>] warn_slowpath_common+0x97/0xb1
Jan 19 08:27:03 Tower kernel: [<ffffffff81286a7c>] ? __btrfs_abort_transaction+0x4d/0x10e
Jan 19 08:27:03 Tower kernel: [<ffffffff810477b6>] warn_slowpath_fmt+0x41/0x43
Jan 19 08:27:03 Tower kernel: [<ffffffff81286a7c>] __btrfs_abort_transaction+0x4d/0x10e
Jan 19 08:27:03 Tower kernel: [<ffffffff812ac0b9>] cleanup_transaction+0x80/0x21d
Jan 19 08:27:03 Tower kernel: [<ffffffff810724cb>] ? wait_woken+0x7d/0x7d
Jan 19 08:27:03 Tower kernel: [<ffffffff812ad323>] btrfs_commit_transaction+0xa6c/0xa81
Jan 19 08:27:03 Tower kernel: [<ffffffff812a90d8>] transaction_kthread+0xfa/0x1cb
Jan 19 08:27:03 Tower kernel: [<ffffffff812a8fde>] ? btrfs_cleanup_transaction+0x461/0x461
Jan 19 08:27:03 Tower kernel: [<ffffffff8105c71a>] kthread+0xd6/0xde
Jan 19 08:27:03 Tower kernel: [<ffffffff8105c644>] ? kthread_create_on_node+0x172/0x172
Jan 19 08:27:03 Tower kernel: [<ffffffff815f6d92>] ret_from_fork+0x42/0x70
Jan 19 08:27:03 Tower kernel: [<ffffffff8105c644>] ? kthread_create_on_node+0x172/0x172
Jan 19 08:27:03 Tower kernel: ---[ end trace 8cb68d7bf330c30a ]---
Jan 19 08:27:03 Tower kernel: BTRFS: error (device loop0) in cleanup_transaction:1692: errno=-5 IO failure
Jan 19 08:27:03 Tower kernel: BTRFS info (device loop0): delayed_refs has NO entry
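To pin down when the trouble began without scrolling through the whole syslog, something like the following could be used. It is only a sketch: the marker strings are copied from the excerpt above, `first_failure` is a hypothetical helper name, and other failures may well log different text.

```python
# Substrings taken from the kernel messages quoted in this post.
# Assumption: these are the lines that mark the start of the problem.
MARKERS = (
    "I/O error, dev loop0",
    "forced readonly",
    "Transaction aborted",
)


def first_failure(lines, markers=MARKERS):
    """Return (line_number, text) of the first line containing any marker.

    Line numbers start at 1. Returns None when no marker is found.
    """
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in markers):
            return number, line.rstrip("\n")
    return None
```

Calling `first_failure` on an open syslog file gives the earliest matching line, which is the timestamp worth quoting when asking for help.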

 

Full syslog included.

 

I am running a preclear and am reluctant to reboot.

syslog.zip

Link to comment

Haven't looked at the syslog yet but you really should post Tools - Diagnostics instead.

 

What do you have on the Dashboard for docker space (under System Status the numbers labeled flash : log : docker)?

 

3%, 25% and 51%

 

The diagnostics zip is too large to attach unfortunately, which files do you need?

Link to comment

In addition to what you have posted, your syslog also has a lot of this from the beginning:

Jan 15 17:06:41 Tower shfs/user: share cache full

Probably your docker.img is corrupt, which is the /dev/loop0 stuff, but that other message makes me think you also have one or more corrupt btrfs disks. Possibly all this is limited to the cache drive.

 

No personal experience with btrfs repair. Let us know what you find.

Link to comment

Also ran BTRFS scrub on docker, 0 errors

 

I turned off Docker as a whole.

 

Log shows:

 

Jan 19 18:43:31 Tower php: /usr/local/emhttp/plugins/dynamix.docker.manager/event/stopping_svcs
Jan 19 18:43:31 Tower logger: stopping docker ...
Jan 19 18:43:32 Tower logger: waiting for docker to die...
Jan 19 18:43:33 Tower logger: waiting for docker to die...
Jan 19 18:43:34 Tower logger: waiting for docker to die...
Jan 19 18:43:35 Tower logger: waiting for docker to die...
Jan 19 18:43:36 Tower logger: waiting for docker to die...
Jan 19 18:43:37 Tower logger: waiting for docker to die...
Jan 19 18:43:38 Tower logger: waiting for docker to die...
Jan 19 18:43:39 Tower logger: waiting for docker to die...
Jan 19 18:43:40 Tower logger: waiting for docker to die...
Jan 19 18:43:41 Tower logger: waiting for docker to die...
Jan 19 18:43:42 Tower logger: waiting for docker to die...
Jan 19 18:43:43 Tower logger: waiting for docker to die...
Jan 19 18:43:44 Tower logger: waiting for docker to die...
Jan 19 18:43:45 Tower logger: waiting for docker to die...
Jan 19 18:44:06 Tower logger: unmounting docker loopback
Jan 19 18:44:06 Tower emhttp: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 19 18:44:06 Tower kernel: BTRFS warning (device loop0): page private not zero on page 233373696
Jan 19 18:44:06 Tower kernel: BTRFS warning (device loop0): page private not zero on page 233377792
Jan 19 18:44:06 Tower kernel: BTRFS warning (device loop0): page private not zero on page 233381888
Jan 19 18:44:06 Tower kernel: BTRFS warning (device loop0): page private not zero on page 233385984

 

Now enabling docker again..

 

Log shows:

 

Jan 19 18:47:18 Tower php: /usr/local/emhttp/plugins/dynamix.docker.manager/event/started
Jan 19 18:47:18 Tower kernel: BTRFS info (device loop0): disk space caching is enabled
Jan 19 18:47:18 Tower kernel: BTRFS: has skinny extents
Jan 19 18:47:19 Tower logger: Resize '/var/lib/docker' of 'max'
Jan 19 18:47:19 Tower kernel: BTRFS: new size for /dev/loop0 is 26843545600
Jan 19 18:47:19 Tower logger: starting docker ...
Jan 19 18:47:47 Tower logger: CrashPlan: started succesfully!
Jan 19 18:47:47 Tower kernel: device vethc7398cb entered promiscuous mode
Jan 19 18:47:47 Tower kernel: docker0: port 1(vethc7398cb) entered forwarding state
Jan 19 18:47:47 Tower kernel: docker0: port 1(vethc7398cb) entered forwarding state
Jan 19 18:47:47 Tower avahi-daemon[27136]: Withdrawing workstation service for vethcaa6520.
Jan 19 18:47:47 Tower kernel: docker0: port 1(vethc7398cb) entered disabled state
Jan 19 18:47:47 Tower kernel: eth0: renamed from vethcaa6520
Jan 19 18:47:47 Tower kernel: docker0: port 1(vethc7398cb) entered forwarding state
Jan 19 18:47:47 Tower kernel: docker0: port 1(vethc7398cb) entered forwarding state
Jan 19 18:47:47 Tower logger: Dolphin: started succesfully!
Jan 19 18:47:48 Tower kernel: device vethb623ef2 entered promiscuous mode
Jan 19 18:47:48 Tower kernel: docker0: port 2(vethb623ef2) entered forwarding state
Jan 19 18:47:48 Tower kernel: docker0: port 2(vethb623ef2) entered forwarding state
Jan 19 18:47:48 Tower avahi-daemon[27136]: Withdrawing workstation service for veth4ae3cbf.
Jan 19 18:47:48 Tower kernel: eth0: renamed from veth4ae3cbf
Jan 19 18:47:48 Tower logger: Transmission: started succesfully!
Jan 19 18:47:48 Tower kernel: device vethf37cd83 entered promiscuous mode
Jan 19 18:47:48 Tower kernel: docker0: port 3(vethf37cd83) entered forwarding state
Jan 19 18:47:48 Tower kernel: docker0: port 3(vethf37cd83) entered forwarding state
Jan 19 18:47:48 Tower avahi-daemon[27136]: Withdrawing workstation service for veth752ec5a.
Jan 19 18:47:48 Tower kernel: docker0: port 3(vethf37cd83) entered disabled state
Jan 19 18:47:48 Tower kernel: eth0: renamed from veth752ec5a
Jan 19 18:47:48 Tower kernel: docker0: port 3(vethf37cd83) entered forwarding state
Jan 19 18:47:48 Tower kernel: docker0: port 3(vethf37cd83) entered forwarding state
Jan 19 18:47:48 Tower logger: SickRage: started succesfully!
Jan 19 18:47:49 Tower logger: PlexMediaServer: started succesfully!
Jan 19 18:47:50 Tower kernel: device veth65bdaba entered promiscuous mode
Jan 19 18:47:50 Tower kernel: docker0: port 4(veth65bdaba) entered forwarding state
Jan 19 18:47:50 Tower kernel: docker0: port 4(veth65bdaba) entered forwarding state
Jan 19 18:47:50 Tower avahi-daemon[27136]: Withdrawing workstation service for vethb671968.
Jan 19 18:47:50 Tower kernel: docker0: port 4(veth65bdaba) entered disabled state
Jan 19 18:47:50 Tower kernel: eth0: renamed from vethb671968
Jan 19 18:47:50 Tower kernel: docker0: port 4(veth65bdaba) entered forwarding state
Jan 19 18:47:50 Tower kernel: docker0: port 4(veth65bdaba) entered forwarding state
Jan 19 18:47:50 Tower logger: Deluge: started succesfully!
Jan 19 18:47:50 Tower kernel: device veth02f00c7 entered promiscuous mode
Jan 19 18:47:50 Tower kernel: docker0: port 5(veth02f00c7) entered forwarding state
Jan 19 18:47:50 Tower kernel: docker0: port 5(veth02f00c7) entered forwarding state
Jan 19 18:47:50 Tower avahi-daemon[27136]: Withdrawing workstation service for veth70ba63e.
Jan 19 18:47:50 Tower kernel: eth0: renamed from veth70ba63e
Jan 19 18:47:52 Tower logger: CouchPotatoMovies: started succesfully!
Jan 19 18:47:52 Tower ntpd[1625]: Listen normally on 4 docker0 172.17.42.1:123
Jan 19 18:47:52 Tower ntpd[1625]: new interface(s) found: waking up resolver
Jan 19 18:47:52 Tower kernel: device veth55e1bae entered promiscuous mode
Jan 19 18:47:52 Tower kernel: docker0: port 6(veth55e1bae) entered forwarding state
Jan 19 18:47:52 Tower kernel: docker0: port 6(veth55e1bae) entered forwarding state
Jan 19 18:47:52 Tower avahi-daemon[27136]: Withdrawing workstation service for vethe9c7267.
Jan 19 18:47:52 Tower kernel: docker0: port 6(veth55e1bae) entered disabled state
Jan 19 18:47:52 Tower kernel: eth0: renamed from vethe9c7267
Jan 19 18:47:52 Tower kernel: docker0: port 6(veth55e1bae) entered forwarding state
Jan 19 18:47:52 Tower kernel: docker0: port 6(veth55e1bae) entered forwarding state
Jan 19 18:47:52 Tower logger: SABnzbd: started succesfully!
Jan 19 18:48:02 Tower kernel: docker0: port 1(vethc7398cb) entered forwarding state
Jan 19 18:48:03 Tower kernel: docker0: port 2(vethb623ef2) entered forwarding state
Jan 19 18:48:03 Tower kernel: docker0: port 3(vethf37cd83) entered forwarding state
Jan 19 18:48:05 Tower kernel: docker0: port 4(veth65bdaba) entered forwarding state
Jan 19 18:48:05 Tower kernel: docker0: port 5(veth02f00c7) entered forwarding state
Jan 19 18:48:07 Tower kernel: docker0: port 6(veth55e1bae) entered forwarding state
Jan 19 18:48:09 Tower logger:  Updating templates...  Updating info...  Done.

 

Dockers look like they are functioning again..

 

So far no weird messages. Combined with the fact that scrub found no corruption, I am thinking that the docker process itself got into trouble, and restarting that process fixed it?

 

The second scrub of the cache drive (after closing down the dockers) also shows no errors..

Link to comment

Looks like your docker image was forced read-only due to errors. When you restarted the docker service it was likely remounted read-write again and "fixed" your issue. I would either run a btrfs check on the docker image, or just recreate it from scratch to make sure it doesn't still have issues (it probably does).
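One way to confirm whether the image is currently mounted read-only is to look at the mount options in /proc/mounts. The sketch below assumes the image is mounted at /var/lib/docker (the path that appears in the resize line of the log above) and that the kernel reports a forced read-only filesystem as `ro` there. The helper names are made up for illustration.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point, or None if it is not listed.

    `mounts_text` is the content of /proc/mounts, where each line reads:
    device mount_point fs_type options dump pass
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point overrides an earlier one.
            result = fields[3].split(",")
    return result


def is_read_only(mounts_text, mount_point="/var/lib/docker"):
    """True when mount_point is listed with the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options
```

Reading /proc/mounts into a string and passing it to `is_read_only` before and after restarting the Docker service would show whether the restart really did bring the image back as read-write. It says nothing about whether the filesystem inside the image is healthy, so checking or recreating the image is still the safer route.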

Link to comment
