openam

April 23

So I just had something weird happen with this machine. A bunch of the dockers were stopped, and they wouldn't start. I disabled docker and restarted it, and they all seemed to start up fine. I took some more diagnostics (attached). I just realized that the last couple of times I've tried to do this, the UI wouldn't serve them up as a download. I end up SSHing in and copying them to a different network share to pull them down.

palazzo-diagnostics-20240422-2013.zip

April 22

Here are the diagnostics, but it also just ran a check and says everything is good.

image.png.79c65d7f89f0d99ac4a819aadc728390.png

edit: just noticed that was before my last post. Not sure why I didn't notice that before.

palazzo-diagnostics-20240422-1416.zip

April 20

Thank you very much. It said it restored, but I see these warnings. Is this anything to worry about?

image.png.6d3bb81e71ad75436dd5d9891ae3ab60.png image.png.dbb7552241ceae2ea4c65b21d895af04.png

April 18

Well it's back up. Still says disk 5 is disabled.

palazzo-diagnostics-20240418-1325.zip

April 18

I went to interact with a docker web application, and noticed it wasn't running. Logged into unRAID to find the array wasn't started. I had rebooted it a day or two ago. I ended up starting the array, and it started a parity check.

I ended up looking at the uptime and it said 1 day 13 hours, which seems weird because that would have been like 6am Monday. I definitely don't remember rebooting it at that time.

It seemed like the parity check started fine, and then all of a sudden it said disk 5 had issues. I see in the notifications it says 3 disks with read errors.

Before I started the array all the disks were showing green on the status. Now disk 5 is showing as disabled.

It wouldn't let me download diagnostics via the UI. I had to generate them via CLI and transfer them to another NAS device, but they are attached here. The parity check paused. I assume I need to cancel it. Not sure what I need to do at that point. I tried looking through the diagnostics, and noticed several of the disks are missing SMART reports.

palazzo-diagnostics-20240417-2000.zip
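
For checking which disks are missing SMART reports without opening the archive by hand, here is a minimal Python sketch. It assumes (not confirmed in this thread) that the diagnostics zip keeps its reports in a `smart/` folder and that each report's filename contains the disk's name, such as `disk5`. The function name and the disk list passed to it are hypothetical.

```python
import re
import zipfile


def disks_missing_smart(zip_file, disk_names):
    """Return the disk names that have no matching file under smart/.

    Assumption: SMART report filenames contain the disk name (e.g. "disk5").
    """
    with zipfile.ZipFile(zip_file) as zf:
        smart_files = [
            name for name in zf.namelist()
            if "/smart/" in "/" + name and not name.endswith("/")
        ]
    missing = []
    for disk in disk_names:
        # (?!\d) keeps "disk1" from matching "disk10"
        pattern = re.compile(re.escape(disk) + r"(?!\d)")
        if not any(pattern.search(name) for name in smart_files):
            missing.append(disk)
    return missing
```

Calling `disks_missing_smart("palazzo-diagnostics-20240417-2000.zip", ["disk1", "disk5"])` would list whichever of those names has no report in the archive.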

December 24, 2023

I do have deluge set up 😬. I have thought about swapping it out for qbittorrent with vueTorrent, so maybe I'll try doing that over the holidays.

December 23, 2023

So I ended up creating a 2nd cache array and copying over everything I could to it. Most of my containers worked fine. I ended up having to restore the gitlab data from a backup, which I had running weekly. Then the original cache array, which was btrfs, was re-formatted to zfs. Things seem to be working fine so far.

December 21, 2023

So after I copy everything to a new cache do I just update the shares reference to use the new cache?

Also I got these warnings, but that kind of makes sense since I'm copying from one to the other.

December 21, 2023

What's the best way to copy from the current pool? Just targeting `/mnt/cache` and using rsync to copy to a new drive?
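
A rough sketch of what that rsync copy could look like, written as a small Python helper that only builds the command so it can be reviewed before anything runs. The destination `/mnt/cache2` is a hypothetical mount point for the new drive, and the flag choice (`-a` archive mode, `-H` hard links, `-X` extended attributes, `--info=progress2` for overall progress) is one reasonable option rather than a recommendation from this thread.

```python
import shlex


def rsync_command(src, dst, dry_run=True):
    """Build an rsync argument list that copies the contents of src into dst."""
    args = ["rsync", "-aHX", "--info=progress2"]
    if dry_run:
        # Only report what would be copied; nothing is written
        args.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself
    args += [src.rstrip("/") + "/", dst.rstrip("/") + "/"]
    return args


# /mnt/cache2 is a placeholder for wherever the new drive is mounted
print(shlex.join(rsync_command("/mnt/cache", "/mnt/cache2")))
```

Running with `dry_run=True` first and reading the output is a cheap way to confirm the paths are right before doing the real copy.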

December 21, 2023

I ended up letting it run all night, and still shows no errors. I did order some cheap replacement memory sticks that should be here later tonight though.

Would it be possible to just switch out my entire cache drive for another SSD? I have a smaller one sitting around that I could format and throw in. Then I could try to copy over important things from the old device using unassigned devices, if it'll let me. Where is the configuration for the dockers stored? By that I mean the template configurations, not the appdata or volume mappings.

December 20, 2023

I do not remember having memory problems in the past (which may mean I personally have them 😁).

Another thing I did recently (about 1 week ago) was upgrade from 6.9.x to 6.12.x. Which makes me wonder if it's something like this guy was seeing,

I have heard of people running memtest for many more hours than I did. Should I let it run all night?

December 20, 2023

Every time I delete some and re-run it appears to get worse.

image.png.5f9169b9f9b07054204c672a0b52d81e.png

December 20, 2023

Oh man. I've been deleting them out of /mnt/user instead of /mnt/cache
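
To avoid mixing up the two locations again, here is a small sketch that translates a path under `/mnt/user` into the matching path on the pool. It assumes the pool is mounted at `/mnt/cache` and uses the same share-relative layout, which matches the `/mnt/cache/tv/...` path in the syslog; the function name is made up for illustration.

```python
from pathlib import PurePosixPath


def user_to_pool_path(path, pool="cache"):
    """Map /mnt/user/<share>/... to /mnt/<pool>/<share>/...

    Raises ValueError if the path is not under /mnt/user.
    """
    relative = PurePosixPath(path).relative_to("/mnt/user")
    return str(PurePosixPath("/mnt") / pool / relative)
```

For example, `user_to_pool_path("/mnt/user/tv/parents/file.mkv")` gives `/mnt/cache/tv/parents/file.mkv`, which is the copy the btrfs errors refer to.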

December 20, 2023

I was down to just 6, but I deleted them and ran it again, and now it's showing 71. Do I just keep playing whack-a-mole?

image.png.6e1a352a92a8e30e813f01ba97b75024.png

December 20, 2023

You're right, but it says they're uncorrectable. Apparently I don't know how to read that summary.

image.png.429618f00cd2fe58b76fd050f085e621.png

December 20, 2023

It finished the scrub, and didn't find anything.

I did however see this in the syslog. I'm guessing I might just be able to delete that file?

Dec 19 21:59:25 palazzo webGUI: Successful login user root from 10.0.0.127
Dec 19 23:31:36 palazzo monitor: Stop running nchan processes
Dec 20 01:41:50 palazzo kernel: BTRFS warning (device sdk1): csum failed root 5 ino 207444295 off 572157952 csum 0x6d807861 expected csum 0xbda7158d mirror 1
Dec 20 01:41:50 palazzo kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 56, gen 0
Dec 20 01:41:50 palazzo shfs: copy_file: /mnt/cache/tv/parents/Show Name/Season 01/Show Name.01x05.Episode Name.mkv /mnt/disk5/tv/parents/Show Name/Season 01/Show Name.01x05.Episode Name.mkv.partial (5) Input/output error
Dec 20 01:41:50 palazzo kernel: BTRFS warning (device sdk1): csum failed root 5 ino 207444295 off 572157952 csum 0x6d807861 expected csum 0xbda7158d mirror 1
Dec 20 01:41:50 palazzo kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 57, gen 0
Dec 20 01:41:50 palazzo kernel: BTRFS warning (device sdk1): csum failed root 5 ino 207444295 off 572157952 csum 0x6d807861 expected csum 0xbda7158d mirror 1
Dec 20 01:41:50 palazzo kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 58, gen 0
Dec 20 01:41:50 palazzo kernel: BTRFS warning (device sdk1): csum failed root 5 ino 207444295 off 572157952 csum 0x6d807861 expected csum 0xbda7158d mirror 1
Dec 20 01:41:50 palazzo kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 59, gen 0
Dec 20 01:41:50 palazzo kernel: BTRFS warning (device sdk1): csum failed root 5 ino 207444295 off 572157952 csum 0x6d807861 expected csum 0xbda7158d mirror 1
Dec 20 01:41:50 palazzo kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 60, gen 0
Dec 20 02:20:03 palazzo kernel: PMS LoudnessCmd[971]: segfault at 0 ip 000014ad5e111080 sp 000014ad595240c8 error 4 in libswresample.so.4[14ad5e109000+18000] likely on CPU 6 (core 2, socket 0)
Dec 20 02:20:03 palazzo kernel: Code: 01 cf 4c 39 c7 72 e3 c3 cc cc 8d 04 49 48 98 4d 89 c1 49 29 c1 48 63 c2 48 63 c9 49 39 f9 76 75 f2 0f 10 05 02 05 ff ff 66 90 <0f> bf 16 0f 57 c9 f2 0f 2a ca f2 0f 59 c8 f2 0f 11 0f 0f bf 14 06
Dec 20 02:21:36 palazzo kernel: PMS LoudnessCmd[4247]: segfault at 0 ip 000014ef176be080 sp 000014ef120790c8 error 4 in libswresample.so.4[14ef176b6000+18000] likely on CPU 7 (core 3, socket 0)
Dec 20 02:21:36 palazzo kernel: Code: 01 cf 4c 39 c7 72 e3 c3 cc cc 8d 04 49 48 98 4d 89 c1 49 29 c1 48 63 c2 48 63 c9 49 39 f9 76 75 f2 0f 10 05 02 05 ff ff 66 90 <0f> bf 16 0f 57 c9 f2 0f 2a ca f2 0f 59 c8 f2 0f 11 0f 0f bf 14 06
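
Since the same warning repeats many times, a short Python sketch like the one below can boil the syslog down to the latest error counters per device and the set of inodes with checksum failures. The regular expressions are written against the exact line formats shown above; the function name and the `SAMPLE` variable are just for illustration.

```python
import re

# Per-device counters, e.g. "bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 56, gen 0"
COUNTERS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
# Checksum failures, capturing the inode of the affected file
CSUM = re.compile(r"csum failed root (?P<root>\d+) ino (?P<ino>\d+)")


def summarize_btrfs(lines):
    """Return (latest counters per device, set of inodes with csum failures)."""
    latest = {}
    inodes = set()
    for line in lines:
        counters = COUNTERS.search(line)
        if counters:
            latest[counters["dev"]] = {
                key: int(counters[key])
                for key in ("wr", "rd", "flush", "corrupt", "gen")
            }
        csum = CSUM.search(line)
        if csum:
            inodes.add(int(csum["ino"]))
    return latest, inodes


SAMPLE = [
    "Dec 20 01:41:50 palazzo kernel: BTRFS warning (device sdk1): csum failed root 5 ino 207444295 off 572157952 csum 0x6d807861 expected csum 0xbda7158d mirror 1",
    "Dec 20 01:41:50 palazzo kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 0, flush 0, corrupt 60, gen 0",
]
print(summarize_btrfs(SAMPLE))
```

On the lines above, every warning points at the same inode (207444295), so all of the corrupt-counter increments in this excerpt trace back to one file, the one named in the `shfs: copy_file` line.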

December 20, 2023

What do you mean by a scrub of the cache drive? How do I go about that?

December 20, 2023

Attached picture of memtest run. Did 4+ passes with 0 errors. I have also rebooted and set network to ipvlan. It's running a parity check again because of the unclean shutdown. I'm guessing that's because of the errors that were present.

It looks like some of the errors in the syslog had to do with btrfs. The only btrfs drive I have is the cache drive. Is it possible that there is just something corrupt in the docker.img? Would blowing that away and rebuilding it be of any use? If so, are there detailed instructions for that?

December 19, 2023

My docker just stopped working. I did try adding a new docker using the TRaSH guide. Not sure if something I did in there could have caused the problems. There are some BTRFS errors in the syslog:

Dec 18 03:18:05 palazzo kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
Dec 18 03:18:05 palazzo kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 4, rd 0, flush 0, corrupt 0, gen 0

There was a similar issue the other day. I restarted it, and it seemed to be working fine. I think I may try restarting, and disabling the 2 new docker containers from the TRaSH guide.

palazzo-diagnostics-20231218-1929.zip palazzo-diagnostics-20231215-1819.zip

November 14, 2023

Everything is back up and running thanks @JorgeB!

November 13, 2023

Everything looked good on emulated disk5, rebuild has started, says about 13hrs to complete!

November 13, 2023

Followed the steps and attaching the new diagnostics

palazzo-diagnostics-20231113-1000.zip

November 13, 2023

Yes disk 3 is still showing as disabled. Disk 3 became disabled on November 7th, earlier in this thread. I haven't really had the array started much since then.

palazzo-diagnostics-20231113-0816.zip

November 13, 2023

The pre-clear has finished, and I was able to find the classic bios mode. Disk 5 is not recognizable by the bios. I found where it showed the disks in the bios, unplugged all the disks and connected disk 5 to P4 on the motherboard. When I booted it back up it didn't see anything except the USB device.

What's the best way to save most of the remaining data? I'm half tempted to assume that all the disks are fine except disk 5, and just rebuild disk 5. Does that seem like a crazy solution? Is that even possible?

image.png.6d441d9e533d1acc41cf4ae8f285cc2f.png

preclear_disk_2EJABBEX_15770.txt

November 10, 2023

I see this oversimplified BIOS screen. Seems like there used to be a much more standard-looking BIOS screen. I'll play around a little more. I'm currently running a pre-clear on that renewed 8TB drive; it's 56% through zeroing right now.

Posts posted by openam

6.12.6 Multiple drive issues

[solved] unRAID 6.12.6 Docker Service failed to start

(solved) unRAID 6.9.2 - /var/log is getting full (currently 100 % used)
