BTRFS error and Read-only cache since updating to 6.12



Hello all.

 

I have used Unraid for nearly 2 years, and it was working quite well, with no crashes.

Since the release of 6.12 and 6.12.1, my cache randomly crashes with a BTRFS error and switches to read-only.

It's always the same; it starts with:

 

Jun 25 14:15:33 Becky kernel: BTRFS error (device nvme0n1p1): block=762904576 write time tree block corruption detected
Jun 25 14:15:33 Becky kernel: BTRFS: error (device nvme0n1p1) in btrfs_commit_transaction:2460: errno=-5 IO failure (Error while writing out transaction)
Jun 25 14:15:33 Becky kernel: BTRFS info (device nvme0n1p1: state E): forced readonly
Jun 25 14:15:33 Becky kernel: BTRFS warning (device nvme0n1p1: state E): Skipping commit of aborted transaction.
Jun 25 14:15:33 Becky kernel: BTRFS: error (device nvme0n1p1: state EA) in cleanup_transaction:1958: errno=-5 IO failure
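
For anyone who wants to check how often this happens, here is a minimal Python sketch that scans syslog lines for the `forced readonly` message shown above and reports the timestamp and device of each event. The function name `find_forced_readonly` and the regular expression are illustrative assumptions based only on the log format quoted in this post; this is not an Unraid tool.

```python
import re

# Assumed pattern, derived from the log lines quoted above:
# "<Mon> <day> <HH:MM:SS> <host> kernel: BTRFS info (device <dev>...): forced readonly"
READONLY_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+kernel:\s+"
    r"BTRFS info \(device (?P<dev>\w+)[^)]*\): forced readonly"
)


def find_forced_readonly(lines):
    """Return a list of (timestamp, device) for every 'forced readonly' line."""
    events = []
    for line in lines:
        match = READONLY_RE.match(line)
        if match:
            events.append((match.group("ts"), match.group("dev")))
    return events


# Sample input copied from the log excerpt in this post.
sample_log = [
    "Jun 25 14:15:33 Becky kernel: BTRFS error (device nvme0n1p1): block=762904576 write time tree block corruption detected",
    "Jun 25 14:15:33 Becky kernel: BTRFS info (device nvme0n1p1: state E): forced readonly",
    "Jun 25 14:15:33 Becky kernel: BTRFS warning (device nvme0n1p1: state E): Skipping commit of aborted transaction.",
]

for timestamp, device in find_forced_readonly(sample_log):
    print(f"{timestamp}: {device} went read-only")
```

To use it on a real log, read the lines from a saved syslog file instead of `sample_log`.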

 

No need to reboot the hardware; just stopping and restarting the array brings my cache back online.

 

I ran 2 different memory tests (PassMark's and Memtest86+), and the hardware is stable. Temperatures are okay; the CPU never goes above 75°C. I even switched SSDs, from a SATA drive to an NVMe drive. Same errors.

 

Does anyone have an idea? I'm really thinking of switching my cache from BTRFS to XFS.

becky-diagnostics-20230625-1604.zip

Edited by Fl4v1en
  • 3 weeks later...

I am having the same symptoms after upgrading from 6.11.5 to 6.12.2. The affected pool is a btrfs mirror of 2 NVMe drives. I reformatted it once after the upgrade to 6.12.2, and it triggered again sometime yesterday. Docker and VMs are running on a separate btrfs mirror (/mnt/user/virtualization). /mnt/user/cache is usually the one I've noticed going read-only; most of what goes through it is system backups (Time Machine, AOMEI Backupper and Windows Backup). I attached diags in case there's something similar. These issues only started after the upgrade; no other changes to hardware have been made.

 

 

phoenix-diagnostics-20230717-0759.zip


I too am having this problem. The system was running fine until the upgrade to 6.12, and it is still happening on 6.12.3. I have two NVMe pools: cache and VMs. I didn't have any BTRFS errors on the VMs NVMe drive, so I swapped the two, and now the former VMs drive, serving as the cache drive, is getting the same BTRFS errors. The old cache NVMe drive, now used for my VMs, hasn't had a single BTRFS error since the swap. I've successfully cleared the errors by running a scrub and finding the corrupt data, but any time a new file is downloaded by nzbget I get the same BTRFS error and have to locate the corrupt data to clear the errors. This is a daily thing now. If I don't clear the errors daily, the corrupt data eventually becomes too much, Docker containers begin to fail, and I have to move all data from the cache pool onto the array, reformat the drive, and move the data back onto the cache pool. Pretty frustrating, to say the least.

 

 

Jul 25 02:05:12 Unraid kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 81, gen 0
Jul 25 02:05:12 Unraid kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 855986 off 24414867456 csum 0x329e22d6 expected csum 0x43aeae62 mirror 1
Jul 25 02:05:12 Unraid kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 82, gen 0
Jul 25 02:05:12 Unraid kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 855986 off 24414867456 csum 0x329e22d6 expected csum 0x43aeae62 mirror 1
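
Locating the corrupt data by hand gets tedious, so here is a minimal Python sketch that groups the `csum failed` warnings above by device, root and inode, so each affected file only has to be looked up once. The function name `summarize_csum_failures` and the regular expression are illustrative assumptions based only on the log lines quoted in this post.

```python
import re
from collections import Counter

# Assumed pattern, derived from the "csum failed" warnings quoted above.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<dev>\w+)[^)]*\): "
    r"csum failed root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)


def summarize_csum_failures(lines):
    """Count checksum failures per (device, root, inode)."""
    counts = Counter()
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            key = (match.group("dev"), int(match.group("root")), int(match.group("ino")))
            counts[key] += 1
    return counts


# Sample input copied from the log excerpt in this post.
sample_log = [
    "Jul 25 02:05:12 Unraid kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 81, gen 0",
    "Jul 25 02:05:12 Unraid kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 855986 off 24414867456 csum 0x329e22d6 expected csum 0x43aeae62 mirror 1",
    "Jul 25 02:05:12 Unraid kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 82, gen 0",
    "Jul 25 02:05:12 Unraid kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 855986 off 24414867456 csum 0x329e22d6 expected csum 0x43aeae62 mirror 1",
]

for (device, root, inode), hits in summarize_csum_failures(sample_log).items():
    print(f"{device}: root {root}, inode {inode}, {hits} checksum failure(s)")
```

An inode number found this way can usually be mapped to a file path with `btrfs inspect-internal inode-resolve <ino> <mountpoint>`; the mount point to pass (for example `/mnt/cache`) is an assumption that depends on how the pool is named.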

 

Edit:  Added Diag files

unraid-diagnostics-20230731-0722.zip

Edited by Arron
On 6/25/2023 at 4:21 PM, Fl4v1en said:

Since the release of 6.12 and 6.12.1, my cache randomly crashes with a BTRFS error and switches to read-only. [...]

I have exactly the same problem, with the same error pattern.
The error has existed since the upgrade from v6.11.5 to v6.12.3.

I never had any problems with version 6.11.5.

I'll now also try changing from a Btrfs vdisk to a directory.


I'm getting the same since upgrading from 6.11.something to 6.12.3 - cache disk going read only after a period of time.  Server had been rock-solid and running without issue or reboot for nearly 3 months before the upgrade.  Now it's died with 5 or 6 separate issues over the last couple of days.  I'm currently (unrelated, it seems) rebuilding the array disk and the cache has gone read only again.  Once the array rebuilds then I'll have to sort the cache again, but the server is unusable in this state because it means that all the Docker containers go offline every day or so.  Diagnostics attached, thanks.

tower-diagnostics-20230731-2309.zip


I don't get it. The only thing I see there is:

Quote

If you revert back from 6.12 to 6.11.5 or earlier, you have to force update all your Docker containers and start them manually after downgrading. This is necessary because of the underlying change to cgroup v2 starting with 6.12.0-rc1.

 

If I do this:

Quote


TOTAL DATA PULLED: 0 B

 

Removing container: OnlyOfficeDocumentServer

Error: Server error

 

Command execution
docker create
  --name='OnlyOfficeDocumentServer'
  --net='br0'
  --ip='192.168.50.252'
  -e TZ="Europe/Berlin"
  -e HOST_OS="Unraid"
  -e HOST_HOSTNAME="NAS"
  -e HOST_CONTAINERNAME="OnlyOfficeDocumentServer"
  -e 'TCP_PORT_80'='80'
  -e 'TCP_PORT_443'='443'
  -e 'JWT_SECRET'=''
  -l net.unraid.docker.managed=dockerman
  -l net.unraid.docker.webui='http://[IP]:[PORT:80]'
  -l net.unraid.docker.icon='https://raw.githubusercontent.com/SiwatINC/unraid-ca-repository/master/icons/onlyoffice.png'
  -v '/mnt/user/appdata/onlyofficeds/logs':'/var/log/onlyoffice':'rw'
  -v '/mnt/user/appdata/onlyofficeds/Data':'/var/www/onlyoffice/Data':'rw'
  -v '/mnt/user/appdata/onlyofficeds/fonts':'/usr/share/fonts':'rw' 'onlyoffice/documentserver'

Error response from daemon: Conflict. The container name "/OnlyOfficeDocumentServer" is already in use by container "18f2c1ef672d6222dfe1362d0ea4702ad39448b70d8789f1554cc11c011d19c8". You have to remove (or rename) that container to be able to reuse that name.

The command failed.

 

Edited by TimTaylor
  • 2 weeks later...
  • 2 weeks later...
On 8/3/2023 at 11:03 PM, mhyclak said:

Reverting to 6.11.5 has resolved the issues for me, so I suspect it must be something with the kernel or btrfs versions in 6.12. 


I've made an account just to reply and say thank you very much to those suggesting the 6.11.5 rollback, as it has fixed my issues. I was experiencing the same btrfs errors discussed in this thread.
