Everything posted by eeking

  1. I just want to say I don't think you're crazy. What you've outlined makes perfect sense to me as well. Successful recovery would have to assume that no bitrot had happened on the other disks in the same blocks, and that parity hadn't been updated during a parity check while the bitrot was present. In fact, if a system like the one you've described were implemented, it would be a good idea for the parity check to consult block checksums on the data disks before deciding to update parity when a difference is detected.

     The problem I think others were trying to get at is that, currently, the parity part of the Unraid software is "dumb" and doesn't know anything about filesystems. It seems to me the recovery method you've described could be implemented as an add-on/plugin just by doing raw reads of the parity and data drives, once you've consulted the filesystem and partition info to get the file's actual location on the target disk. And it seems like it would be safest to write out a new copy of the file and delete the old one rather than try to directly correct the flipped bits on disk. The catch is that this would require precise knowledge of the structure of the xfs/zfs/btrfs filesystems to isolate just the data and not the metadata.

     If you tried to correct the data bits in place, the parity system would flip the bits on the parity drive and parity would end up out of sync with the corrected data. (If I understand correctly, when writing to a disk, the parity system reads the current value on the data disk - which would have been flipped due to rot - and compares it to the new value, then flips the bit on parity if the value changed on the data disk. That's how it avoids needing to spin up all disks when you write to just one.)
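A minimal sketch of the read-modify-write parity update described in that last parenthetical, assuming single (XOR) parity and one byte per "disk"; the rmw_write helper and the values are illustrative only, not Unraid internals. It shows why correcting rotted bits in place would push the error into parity:

```python
# Illustrative only: single XOR parity, one byte per "disk".
# Assumes the read-modify-write behaviour described in the post above.

def rmw_write(parity: int, old_on_disk: int, new_value: int) -> int:
    """Update parity without reading the other disks: P_new = P_old ^ old ^ new."""
    return parity ^ old_on_disk ^ new_value

d1, d2 = 0b10101100, 0b01010001   # two data disks
parity = d1 ^ d2                  # parity starts out consistent

d1_rotted = d1 ^ 0b00001000       # a bit silently flips on disk 1; parity is not updated
assert parity ^ d2 == d1          # so the original byte is still recoverable from parity

# "Fixing" disk 1 in place: the parity engine compares the corrected value
# against the rotted on-disk value, so the flipped bit is applied to parity too.
parity = rmw_write(parity, d1_rotted, d1)

assert parity ^ d2 != d1          # data is correct again, but parity no longer agrees with it
```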
  2. I bought a pre-built system from Limetech back in 2007, the Lian-Li PC-A16 case with 15 hot-swap bays in the front. The system has been in storage for roughly the past 3 years through a move, and today when I decided to get it out and power it up I ran into some issues. Maybe it's the power supply; I'm not certain. After being on for about an hour working on a parity check it just abruptly shut off. Now it generally will not power on at all, though it may come on for a few seconds before shutting down again - sometimes basically immediately, sometimes getting all the way to the USB's Unraid boot menu before shutting off again.

     Given the age of the system, it's probably worth rebuilding rather than just fixing whatever this isolated issue may be. But is anything worth saving? The case and SATA drive bays? I have experience building PCs, but nothing quite like this. And life has happened, so it's been a good while since I've built anything and I'm not up to speed on current hardware trends.

     I'm attaching the "system" folder from an old diagnostics run on the USB, since I can't access the live system to pull info about the hardware. It looks like two Promise PDC40718 PCI cards provided four SATA ports each - those are probably worth replacing with something modern - and the motherboard provided the other seven. There's nothing else particularly special about the system. I only ever really used it for media storage. I'll probably want to run Plex or some DLNA server docker once I get it running again, primarily for local use. TIA for any input!

     system.zip
  3. Unraid 6.6.7

     Currently having some problems getting docker to start. I thought maybe the image file was corrupt or my cache drive had problems, so I've reformatted the cache and recreated the docker image file. Still the webui reports "Docker Service failed to start" and the docker log looks like this:

     time="2019-09-18T23:13:15.560042404-04:00" level=info msg="libcontainerd: started new docker-containerd process" pid=10204
     time="2019-09-18T23:13:15.560204769-04:00" level=info msg="parsed scheme: \"unix\"" module=grpc
     time="2019-09-18T23:13:15.560228333-04:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
     time="2019-09-18T23:13:15.560334583-04:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0 <nil>}]" module=grpc
     time="2019-09-18T23:13:15.560374678-04:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
     time="2019-09-18T23:13:15.560470587-04:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420184a20, CONNECTING" module=grpc
     time="2019-09-18T23:13:15-04:00" level=info msg="starting containerd" revision=468a545b9edcd5932818eb9de8e72413e616e86e version=v1.1.2
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
     time="2019-09-18T23:13:15-04:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found in directory /lib/modules/4.18.20-unRAID\n": exit status 1"
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
     time="2019-09-18T23:13:15-04:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"
     time="2019-09-18T23:13:15-04:00" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
     time="2019-09-18T23:13:15-04:00" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found in directory /lib/modules/4.18.20-unRAID\n": exit status 1"
     time="2019-09-18T23:13:15-04:00" level=warning msg="could not use snapshotter zfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"
     time="2019-09-18T23:13:35.572502759-04:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/docker-containerd.sock 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/docker-containerd.sock: timeout\". Reconnecting..." module=grpc
     time="2019-09-18T23:13:35.572634843-04:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420184a20, TRANSIENT_FAILURE" module=grpc
     time="2019-09-18T23:13:35.572890270-04:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420184a20, CONNECTING" module=grpc
     time="2019-09-18T23:13:55.573014687-04:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/docker-containerd.sock 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/docker-containerd.sock: timeout\". Reconnecting..." module=grpc
     time="2019-09-18T23:13:55.573120481-04:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420184a20, TRANSIENT_FAILURE" module=grpc
     time="2019-09-18T23:13:55.573354144-04:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420184a20, CONNECTING" module=grpc
     time="2019-09-18T23:14:15.573428114-04:00" level=warning msg="Failed to dial unix:///var/run/docker/containerd/docker-containerd.sock: grpc: the connection is closing; please retry." module=grpc
     time="2019-09-18T23:14:30.562752369-04:00" level=warning msg="daemon didn't stop within 15 secs, killing it" module=libcontainerd pid=10204
     Failed to connect to containerd: failed to dial "/var/run/docker/containerd/docker-containerd.sock": context deadline exceeded

     Any thoughts?
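The repeated "transport: error while dialing ... docker-containerd.sock: timeout" entries mean dockerd spawned containerd but then couldn't reach it over its unix socket before giving up. A rough diagnostic sketch (my own, not an official tool; only the socket path is taken from the log above) to confirm whether anything is actually accepting connections on that socket:

```python
# Hypothetical diagnostic: check whether containerd's socket exists and answers.
import os
import socket
import sys

SOCK = "/var/run/docker/containerd/docker-containerd.sock"  # path from the log above

if not os.path.exists(SOCK):
    sys.exit(f"{SOCK} does not exist - containerd likely never got far enough to create it")

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.settimeout(5)
try:
    s.connect(SOCK)
    print("connected - containerd appears to be listening")
except (ConnectionRefusedError, socket.timeout) as err:
    print(f"could not connect ({err}) - matches the dial timeouts dockerd is logging")
finally:
    s.close()
```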