Docker.img corruption (again)


I have been getting frequent occurrences of my docker.img file becoming corrupt.

No other drive or filesystem errors. SMART tests report no errors.

 

I don't know whether the recurring problem is caused by my cache drive, Docker, Unraid, or ZFS.

 

My guess:

My cache drive is ZFS.  The docker file is BTRFS vDisk.  Maybe there's a conflict?

 

# zpool status -xv
  pool: cache
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:15:38 with 3 errors on Sun Aug 27 12:39:33 2023
config:

        NAME        STATE     READ WRITE CKSUM
        cache       ONLINE       0     0     0
          sdi1      ONLINE       0     0   128

errors: Permanent errors have been detected in the following files:

        /mnt/cache/system/docker/docker.img
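
For anyone who wants to watch these counters over time, here is a minimal sketch (not an official Unraid or OpenZFS tool) that pulls the devices with a non-zero CKSUM column out of pasted `zpool status` text. The function name `nonzero_cksum` and the `SAMPLE` string are made up for illustration, and the sketch assumes the counters are printed as plain integers, as in the output above.

```python
def nonzero_cksum(status_text):
    """Return {device_name: cksum_count} for rows with a non-zero CKSUM column."""
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        # The device table starts right after this header row.
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True
            continue
        if not in_table:
            continue
        # A blank line or the "errors:" line ends the device table.
        if not fields or fields[0] == "errors:":
            in_table = False
            continue
        # Column 5 is CKSUM; rows may carry trailing notes such as "(repairing)".
        if len(fields) >= 5 and fields[4].isdigit() and int(fields[4]) > 0:
            counts[fields[0]] = int(fields[4])
    return counts


# Hypothetical sample copied from the device table above.
SAMPLE = """\
config:

        NAME        STATE     READ WRITE CKSUM
        cache       ONLINE       0     0     0
          sdi1      ONLINE       0     0   128

errors: Permanent errors have been detected in the following files:
"""

print(nonzero_cksum(SAMPLE))  # {'sdi1': 128}
```

Saving the command output to a file after each scrub and feeding it through a function like this would show whether the checksum count keeps growing.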

 

 

tower-diagnostics-20230827-1425.zip

  • Community Expert
17 minutes ago, Jaybau said:

My cache drive is ZFS.  The docker file is BTRFS vDisk.  Maybe there's a conflict?

That's fine, I would start by running memtest.

  • Author
6 hours ago, JorgeB said:

That's fine, I would start by running memtest.

 

memtest = 1 pass, 0 errors.

Edited by Jaybau

  • 1 year later...

Did you ever solve this? I have the same issue happening.

memtest passed multiple times. I check monthly, and although I have only been running Unraid for one month, I have run it at least 3 times and every run passed.

 

root@NAS2:~# zpool status cache -v
  pool: cache
 state: ONLINE
  scan: scrub repaired 0B in 00:00:29 with 1 errors on Tue Dec 31 10:16:17 2024
config:
        NAME           STATE     READ WRITE CKSUM
        cache          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            nvme0n1p1  ONLINE       0     0     4
            nvme1n1p1  ONLINE       0     0     4
errors: Permanent errors have been detected in the following files:
        cache/domains:<0x4>
        
root@NAS2:~# zpool status cache -v
  pool: cache
 state: ONLINE
  scan: scrub repaired 0B in 00:00:29 with 1 errors on Tue Dec 31 10:16:17 2024
config:
        NAME           STATE     READ WRITE CKSUM
        cache          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            nvme0n1p1  ONLINE       0     0     7
            nvme1n1p1  ONLINE       0     0     7
errors: Permanent errors have been detected in the following files:
        /mnt/cache/system/docker/docker.img
        cache/domains:<0x4>

 

As you can see, docker.img appears between two runs of the command for some reason, without any change to the Docker containers.

The <0x4> entry shows because I deleted that file earlier today, as it was irrelevant, but now the same thing is happening to docker.img.
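
A rough way to spot entries that appear between two runs like the ones above is to compare the "Permanent errors" lists. The sketch below is only an illustration, not a ZFS feature: `permanent_errors`, `FIRST_RUN` and `SECOND_RUN` are hypothetical names, and it assumes the error list is the last part of each pool's output, as it is in the pasted text.

```python
MARKER = "Permanent errors have been detected in the following files:"


def permanent_errors(status_text):
    """Return the set of entries listed under the permanent-errors marker."""
    entries = set()
    in_list = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if MARKER in line:
            in_list = True
            continue
        # A new "pool:" line means the next pool's output has started.
        if stripped.startswith("pool:"):
            in_list = False
            continue
        if in_list and stripped:
            entries.add(stripped)
    return entries


# Hypothetical samples copied from the two runs above.
FIRST_RUN = """\
errors: Permanent errors have been detected in the following files:
        cache/domains:<0x4>
"""

SECOND_RUN = """\
errors: Permanent errors have been detected in the following files:
        /mnt/cache/system/docker/docker.img
        cache/domains:<0x4>
"""

new_entries = permanent_errors(SECOND_RUN) - permanent_errors(FIRST_RUN)
print(sorted(new_entries))  # ['/mnt/cache/system/docker/docker.img']
```

Run against saved outputs, this only tells you which paths are newly listed. It says nothing about why they became corrupt.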

I am having similar issues on other drives as well; all my ZFS and BTRFS pools are showing corruption.

 

16TBx2 MIRROR Seagate Exos x18

root@NAS2:~# zpool status data -v
  pool: data
 state: ONLINE
  scan: scrub in progress since Tue Dec 31 10:00:33 2024
        2.21T / 4.78T scanned at 504M/s, 972G / 4.78T issued at 216M/s
        256K repaired, 19.87% done, 05:09:15 to go
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdc1    ONLINE       0     0     7  (repairing)
            sde1    ONLINE       0     0     7  (repairing)

errors: Permanent errors have been detected in the following files:

        /mnt/data/Backup/2024-12-23.zip
        /mnt/data/PlexData/Series/...1080p.mkv

 

8TBx2 MIRROR IronWolf

UUID:             9920b33f-88eb-49cf-a2d7-784bb99eaab9
Scrub started:    Mon Dec 30 03:03:45 2024
Status:           finished
Duration:         16:03:12
Total to scrub:   13.85TiB
Rate:             251.06MiB/s
Error summary:    csum=2
  Corrected:      0
  Uncorrectable:  2
  Unverified:     0


It shouldn't be related to the SSDs/drives, as they ran fine without any issues in another system.

Extended SMART tests pass on all drives, HDDScan ERASE passed, and the drives were newly formatted afterwards.

Running 7.0.0-rc.2

  • Community Expert
5 hours ago, hhhhh said:

memtest passed multiple times,

memtest is only definitive if it finds errors. If you have multiple sticks, try using the server with just one; if the problem persists, try with a different one. That will basically rule out bad RAM, which is the #1 reason for data corruption.

This morning the server was unresponsive on the web UI. I tried to log in locally, which was very laggy and ended with "login: timed out after 60 seconds" after entering the username, the exact behavior mentioned in the following post:

Because of these stability issues and the recent checksum issues, I ran a memtest after a reboot, which failed with 2x16GB installed. I then tested each 16GB stick on its own in either slot, and both slots/sticks failed. Both memory sticks somehow started failing in the last month. Strange, as this memory was working fine in the same system running Windows prior to using Unraid. I ran a memtest after deciding to go with ZFS, which passed in mid-December. This system ran fine for 2 years. Either way, I purchased a new 1x32GB stick, which shows no memtest errors. I will keep monitoring, and hope faulty memory caused the issues I experienced.

 

I did notice the CPU was getting quite hot due to some scheduled maintenance/backup tasks between 2:00 and 3:25 AM, which I believe is when Unraid got stuck. HA (running on Unraid) kept collecting data from the external power monitor, but no longer from Unraid, until, I suppose, Docker got stuck as well, since no data was stored after 6 AM.

 

I've changed my fan settings to keep the CPU temp lower under full load, and also improved ventilation so that the memory / cache drives run less hot. 

 

Hoping these changes will solve the issues with system instability.

 

 

 


Edited by hhhhh
