Docker image and/or NVME corrupt?

A couple of days ago my Plex server started becoming unresponsive. A restart would bring it back for roughly 25 hours before it went down again. I enabled debug logging but saw nothing out of the ordinary, so I decided to delete the docker and recreate it. The docker compose failed with "Failed to create btrfs snapshot: input/output error", leading me to believe the docker image was corrupt. I deleted and recreated my docker image and re-downloaded all the dockers. After about the 10th image started up, I began getting loop2 WRITE errors again. I'm in the process of moving things off my NVMe cache pool, but I'm at a loss as to where the image corruption could be coming from. A short SMART self-test of each NVMe shows 0 errors, and the attributes are not telling me much either. I do have a pre-failing disk 5, and the replacement is sitting here on my desk, but I cannot see how disk 5 would be related to my docker image corrupting so quickly after recreating it. Am I missing anything? Or could my NVMe drive(s) be failing?

unraid-diagnostics-20240202-1308.zip

 

EDIT: I am running the extended tests right now. I also see the zfs pool is showing an error. Is there any way to dig into that deeper to find out which drive is bad? Or are both having issues?
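One way to answer the "which drive" question is to look at the per-device READ/WRITE/CKSUM columns that `zpool status` prints. The sketch below only parses that table from text; the pool and device names in the sample are hypothetical and are not taken from the screenshot or the diagnostics.

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows whose counters are not all '0'."""
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            # Header row of the config table; device rows follow.
            in_table = True
            continue
        if in_table:
            if not fields:
                # A blank line ends the config table.
                in_table = False
                continue
            if len(fields) >= 5:
                name, _state, read, write, cksum = fields[:5]
                # Counters stay as strings because zpool may abbreviate (e.g. "1.2K").
                if (read, write, cksum) != ("0", "0", "0"):
                    flagged[name] = (read, write, cksum)
    return flagged


# Hypothetical sample output, for illustration only.
sample_status = """\
  pool: cache
 state: ONLINE
config:

\tNAME           STATE     READ WRITE CKSUM
\tcache          ONLINE       0     0     0
\t  mirror-0     ONLINE       0     0     0
\t    nvme0n1p1  ONLINE       0     0     4
\t    nvme1n1p1  ONLINE       0     0     0

errors: No known data errors
"""

print(devices_with_errors(sample_status))
```

On a live system the text would come from running `zpool status -v` on the pool in question. Non-zero counters on a single device point at that device, while errors spread across several devices may suggest a shared cause.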

image.png.aea1cdd8c3e631270ad5eeafa3d71351.png

Edited by UncleStu
typo - addition of zpool status screenshot

Solved by JorgeB

  • UncleStu changed the title to Docker image and/or NVME corrupt?
  • Community Expert

Ryzen with overclocked RAM, like you have, is known to corrupt data, so I would recommend correcting that and then recreating the pool.

  • Author

Overclocked RAM? I wasn't aware I overclocked anything as I know servers don't care for it. Mind sharing where in the diags that is? Or how you came to see that I have overclocked RAM?

  • Community Expert

See meminfo.txt in the diags: for your config the RAM should be set to 2666MT/s max, and it's running at 3200MT/s.
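The comparison described here can be sketched as a small check. This assumes the file contains dmidecode-style `Configured Memory Speed: N MT/s` lines (older dmidecode versions label this field differently), and the sample text is hypothetical rather than copied from the diagnostics. The 2666 limit is simply the figure quoted in this thread, passed in as a parameter.

```python
import re

# Assumed field label; adjust if the file uses a different one.
SPEED_RE = re.compile(r"^\s*Configured Memory Speed:\s*(\d+)\s*MT/s", re.MULTILINE)


def overclocked_dimms(dmi_text, max_supported_mts):
    """Return the configured speeds (MT/s) that exceed the supported maximum."""
    speeds = [int(s) for s in SPEED_RE.findall(dmi_text)]
    return [s for s in speeds if s > max_supported_mts]


# Hypothetical excerpt, for illustration only.
sample_meminfo = """\
Memory Device
\tSize: 16 GB
\tSpeed: 3200 MT/s
\tConfigured Memory Speed: 3200 MT/s
Memory Device
\tSize: 16 GB
\tSpeed: 3200 MT/s
\tConfigured Memory Speed: 3200 MT/s
"""

print(overclocked_dimms(sample_meminfo, 2666))
```

An empty list would mean every DIMM is running at or below the limit passed in.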

  • Author

I see the speed set to 3200MT in the meminfo.txt file, but I couldn't find where you saw that it should be 2666. 

 

EDIT: I changed my RAM speed to auto and verified in the BIOS that it was 2666. I then erased and re-created both my cache pools. After mover finished moving things back to my NVMe cache pool, the image still corrupted partway through starting the dockers. How can I tell what is corrupting this, and why?

unraid-diagnostics-20240204-1904.zip

  • Community Expert
13 hours ago, UncleStu said:

but I couldn't find where you saw that it should be 2666. 

Click the link I've posted above.

  • Author

I did but didn't fully follow it. Either way, my docker image is still corrupting after formatting the cache pools. The extended smart tests showed no errors either.

  • Community Expert

image.png

 

56 minutes ago, UncleStu said:

The extended smart tests showed no errors either.

Very unlikely this is a device problem.

  • Author
1 minute ago, JorgeB said:

Very unlikely this is a device problem.

Assuming 'device' could be as broad as hardware, and as in the nvme devices? How can I go about troubleshooting this more? Nothing has changed in the system for a couple of years. Except now I have the latest BIOS and still have the same issue. Oh, and I did swap out my pre-failing disk 5. The data rebuild is 90% complete. But I can't see how that would be an issue.

 

I have 2 cache pools: NVMe and SSD. When the docker image is on the NVMe pool, along with my appdata, it corrupts. I have been using /mnt/user/system/docker/ as the path for the image, and of course I stop the services when I move it. Last night, when I recreated the image again, I used /mnt/s-cache/system/docker instead, putting the image on the SSD cache. The appdata is still at /mnt/user/appdata/, with the appdata share using the cache as preferred and the array as backup.

 

I have not had any errors since starting the dockers last night, but my Plex server UI did time out and required a restart of the docker. This is how it started when I first began to notice corruption. These are the last errors.

Quote

Feb  4 19:05:47 unRAID kernel: BTRFS warning (device loop3): csum failed root 407 ino 7327 off 16207872 csum 0x5780f703 expected csum 0x43ec18ca mirror 1
Feb  4 19:05:47 unRAID kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 0, rd 0, flush 0, corrupt 344, gen 0
Feb  4 19:05:47 unRAID kernel: BTRFS warning (device loop3): csum failed root 407 ino 7327 off 16207872 csum 0x5780f703 expected csum 0x43ec18ca mirror 1
Feb  4 19:05:47 unRAID kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 0, rd 0, flush 0, corrupt 345, gen 0
Feb  4 19:05:48 unRAID kernel: BTRFS warning (device loop3): csum failed root 407 ino 7327 off 16207872 csum 0x5780f703 expected csum 0x43ec18ca mirror 1
Feb  4 19:05:48 unRAID kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 0, rd 0, flush 0, corrupt 346, gen 0
Feb  4 19:05:48 unRAID kernel: BTRFS warning (device loop3): csum failed root 407 ino 7327 off 16207872 csum 0x5780f703 expected csum 0x43ec18ca mirror 1
Feb  4 19:05:48 unRAID kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 0, rd 0, flush 0, corrupt 347, gen 0
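To track how these counters move over time, the `errs:` lines can be pulled out of the syslog with a short script. This is a minimal sketch that assumes the line format shown in the log above; the function name is made up for illustration.

```python
import re

# Matches the per-device counter lines that BTRFS writes to the kernel log.
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_error_counters(log_text):
    """Return {device: {counter: value}} from the last counter line seen per device."""
    latest = {}
    for line in log_text.splitlines():
        match = ERRS_RE.search(line)
        if match:
            latest[match.group("dev")] = {k: int(match.group(k)) for k in COUNTERS}
    return latest


# Two of the lines quoted above.
sample_log = """\
Feb  4 19:05:47 unRAID kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 0, rd 0, flush 0, corrupt 344, gen 0
Feb  4 19:05:48 unRAID kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 0, rd 0, flush 0, corrupt 347, gen 0
"""

print(latest_error_counters(sample_log))
```

In these lines `wr`, `rd` and `flush` stay at 0 while `corrupt` climbs: the reads and writes complete without I/O errors, but the data fails its checksum.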

 

  • Community Expert
  • Solution
6 minutes ago, UncleStu said:

Assuming 'device' could be as broad as hardware, and as in the nvme devices?

Device here means NVMe device(s).

 

I would start by running memtest.

 

 

  • Author
8 hours ago, JorgeB said:

I would start by running memtest.

memtest found that 2 of the 4 sticks had errors. I pulled those for now and started an RMA request with G.Skill. Memtest also passed on the two sticks I left in.

 

I moved my system share and appdata into the same nvme pool. Everything started up with no issues. Historically the docker image would throw errors when on the nvme pool. No errors on the SSD pool or the array. I also added the 'nvme_core.default...' to my default boot, from this post. 

 

Time will tell at this point. Thank you @JorgeB for your assistance.
