Unraid


/var/log is getting full - BTRFS error (device nvme1n1p1)


Hello, 

 

I am having a cache pool problem, and the great Fix Common Problems plugin flagged that my /var/log/syslog was filling up.

 

The two NVMe drives are a few weeks old.  When I went to replace my old cache pool (two SSDs), the replace-one-at-a-time method did not work, so I ended up wiping my cache pool and starting fresh.  From what I can tell, this error first surfaced on 11/19/2022.

 

I found this post, but it does not show what to do if it's not a cabling problem (there are no cables for NVMe).

 

Here are the results of btrfs scrub status /mnt/cache:

UUID:             92b897fd-c2ab-43b4-8adc-7c53792bcd7a
        no stats available
Total to scrub:   280.83GiB
Rate:             0.00B/s
Error summary:    no errors found

 

Here are the results of btrfs dev stats -z /mnt/cache:

[/dev/nvme1n1p1].write_io_errs    0
[/dev/nvme1n1p1].read_io_errs     0
[/dev/nvme1n1p1].flush_io_errs    0
[/dev/nvme1n1p1].corruption_errs  0
[/dev/nvme1n1p1].generation_errs  0
[/dev/nvme2n2p1].write_io_errs    35301385
[/dev/nvme2n2p1].read_io_errs     2140829
[/dev/nvme2n2p1].flush_io_errs    1644451
[/dev/nvme2n2p1].corruption_errs  0
[/dev/nvme2n2p1].generation_errs  0
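
For anyone reading along, here is a minimal sketch of trimming that output down to the counters that are not zero. The function name nonzero_stats is made up for illustration; it only reads text on stdin, so it can be fed from btrfs dev stats. As far as the btrfs-progs documentation goes, -z prints the counters and then resets them, so leave it off if you only want to look.

```shell
# Hypothetical helper: print only the "btrfs dev stats" lines whose counter is not zero.
# Reads the stats text on stdin; the second whitespace-separated field is the counter.
nonzero_stats() {
    awk 'NF == 2 && $2 + 0 > 0 { print $1, $2 }'
}

# Example usage (not run here; needs a mounted btrfs pool):
#   btrfs dev stats /mnt/cache | nonzero_stats
```

On the output above this would leave only the three nvme2n2p1 write, read and flush counters, which points at a single device rather than the whole pool.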

 

Last few lines of syslog:

Nov 21 21:20:16 freddie kernel: btrfs_dev_stat_print_on_error: 42 callbacks suppressed
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250544, rd 2106592, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250545, rd 2106592, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250545, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250546, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250547, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250548, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250549, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250550, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250551, rd 2106593, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:16 freddie kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme2n2p1 errs: wr 35250551, rd 2106594, flush 1643107, corrupt 0, gen 0
Nov 21 21:20:17 freddie kernel: BTRFS warning (device nvme1n1p1): lost page write due to IO error on /dev/nvme2n2p1 (-5)
Nov 21 21:20:17 freddie kernel: BTRFS error (device nvme1n1p1): error writing primary super block to device 2
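
A rough sketch of how log lines like these could be tallied per device, to see which device is flooding the syslog. The helper name count_btrfs_bdev_errors is hypothetical, and it assumes the "bdev <device> errs:" layout shown above. Because the kernel reports "callbacks suppressed", the count is probably a lower bound rather than the real number of errors.

```shell
# Hypothetical helper: count "BTRFS error ... bdev <dev> errs:" lines per device.
# Reads syslog text on stdin and prints "<device> <count>" for each device seen.
count_btrfs_bdev_errors() {
    awk '/BTRFS error/ && /bdev/ {
             for (i = 1; i < NF; i++)
                 if ($i == "bdev") { count[$(i + 1)]++; break }
         }
         END { for (dev in count) print dev, count[dev] }'
}

# Example usage (not run here):
#   count_btrfs_bdev_errors < /var/log/syslog
#   df -h /var/log        # shows how full the log filesystem is
```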

 

unraid-diagnostics-20221121-2118.zip

Solved by JorgeB

  • Author

I rebooted my server and when it came online, I lost my docker.img file and my /mnt/cache/system/libvirt/libvirt.img file.

 

The system said that the Docker service could not start and the VMs service could not start.

 

I zeroed the errors on the pool using btrfs dev stats -c /mnt/cache.

 

When I deleted the docker.img file and recreated it, the corruption_errs value started climbing from 0.  After I recovered my libvirt.img file and started the VMs again, corruption_errs continued to climb.

 

[/dev/nvme1n1p1].write_io_errs    0
[/dev/nvme1n1p1].read_io_errs     0
[/dev/nvme1n1p1].flush_io_errs    0
[/dev/nvme1n1p1].corruption_errs  0
[/dev/nvme1n1p1].generation_errs  0
[/dev/nvme0n1p1].write_io_errs    0
[/dev/nvme0n1p1].read_io_errs     0
[/dev/nvme0n1p1].flush_io_errs    0
[/dev/nvme0n1p1].corruption_errs  103148
[/dev/nvme0n1p1].generation_errs  0

 

unraid-diagnostics-20221121-2238.zip

  • Community Expert
  • Solution

One of the devices dropped offline. You should run a scrub to bring it up to date; corruption errors are normal in this case for every synced block, and you can reset them when done.
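
From the command line, the scrub-then-reset sequence could look roughly like the sketch below (the Unraid GUI has a scrub button on the pool page that does the same job). The helper names run and scrub_and_reset are made up for illustration, and /mnt/cache is the pool path used in this thread. Set DRY_RUN=1 to only print the commands.

```shell
# Hypothetical wrapper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch: foreground scrub, show the summary, then print-and-reset the error counters.
scrub_and_reset() {
    pool=$1                                        # pool mount point, e.g. /mnt/cache
    run btrfs scrub start -B "$pool" || return 1   # -B: stay in the foreground; stop here if the scrub fails
    run btrfs scrub status "$pool"
    run btrfs dev stats -z "$pool"                 # -z: print the counters, then reset them to zero
}

# Example usage:
#   DRY_RUN=1 scrub_and_reset /mnt/cache
```

The counters are only reset after the scrub returns successfully, so a failed scrub leaves the evidence in place.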

 

Make sure the system share is set to COW. The old default was NOCOW, and NOCOW data cannot be corrected.
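
One way to check this from the command line, sketched under the assumption that the share lives at /mnt/cache/system (the path mentioned earlier in the thread): lsattr shows a capital C in the flags column when NOCOW is set. The helper name has_nocow is made up for illustration. Note that, as far as I know, the attribute only applies to files created after it is changed, so existing images such as docker.img would need to be recreated or copied to pick up the new setting.

```shell
# Hypothetical helper: succeed if any lsattr output line on stdin carries the C (No_COW) flag.
# Expects lines in the usual "<flags> <path>" form; lowercase c (compression) is not matched.
has_nocow() {
    awk '{ if (index($1, "C") > 0) found = 1 } END { exit !found }'
}

# Example usage (not run here):
#   lsattr -d /mnt/cache/system | has_nocow && echo "NOCOW is set on the system share"
#   lsattr /mnt/cache/system/libvirt/libvirt.img
```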

 

The below might help with the dropping device.

 

On the main GUI page, click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off


Reboot and see if it helps.
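
After the reboot, a quick way to confirm the options were actually picked up is to look at the running kernel's command line. The sketch below is only an illustration; the helper name cmdline_has is hypothetical, and the sysfs path assumes the nvme_core module exposes its parameter in the usual place.

```shell
# Hypothetical helper: succeed if the kernel command line ($1) contains the exact token ($2).
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example usage after rebooting (not run here):
#   cmdline_has "$(cat /proc/cmdline)" pcie_aspm=off && echo "pcie_aspm=off is active"
#   cmdline_has "$(cat /proc/cmdline)" nvme_core.default_ps_max_latency_us=0 && echo "NVMe option is active"
#   cat /sys/module/nvme_core/parameters/default_ps_max_latency_us   # should print 0 if the option took effect
```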

  • Author

Hi @JorgeB

 

Thank you, before today I was aware of the utilities under "Cache Settings" to rebalance, scrub, etc.  I will have to research a bit to see when to use them.  All schedules for balance and scrub are disabled.

 

Regarding the System Share, I have updated the "Enable Copy-on-write" setting to AUTO; it was on "NO".  What do you mean this cannot be corrected? Also, do you know what the recommendation is for "Use cache pool" for the System Share?  Should it be "ONLY"?

 

Under "Syslinux Configuration" this is my new setting.  I believe I added "" for GPU passthrough some time ago.

 

unRAID OS Label (Syslinux Configuration)

kernel /bzimage
append pcie_acs_override=multifunction initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

 

Thank you, I will be rebooting shortly (waiting for the parity check to finish).

  • Community Expert
14 minutes ago, MickMorley said:

What do you mean this cannot be corrected?

NOCOW disables checksums, so with raid1, if one of the devices drops offline and then comes back online, btrfs has no way of knowing which device has the latest and correct data. It will just read from both alternately, and since the dropped device has wrong data, this can result in data corruption, e.g.:

 

8 hours ago, MickMorley said:

I lost my docker.img file and my /mnt/cache/system/libvirt/libvirt.img file.

 

  • Author

Hi @JorgeB, I appreciate the explanations!

 

So far everything is working normally.  I recreated all of my dockers using the Previous Apps feature and selecting all at once.  I had a backup of the libvirt.img file.

 

I put in your recommendations and all is good.  Syslog looks OK.

 

Running btrfs dev stats -c /mnt/cache shows no errors:

 

[/dev/nvme1n1p1].write_io_errs    0
[/dev/nvme1n1p1].read_io_errs     0
[/dev/nvme1n1p1].flush_io_errs    0
[/dev/nvme1n1p1].corruption_errs  0
[/dev/nvme1n1p1].generation_errs  0
[/dev/nvme0n1p1].write_io_errs    0
[/dev/nvme0n1p1].read_io_errs     0
[/dev/nvme0n1p1].flush_io_errs    0
[/dev/nvme0n1p1].corruption_errs  0
[/dev/nvme0n1p1].generation_errs  0

 
