Fl4v1en Posted June 25, 2023

Hello all. I've been using Unraid for nearly 2 years and it was working quite well, with no crashes. Since the release of 6.12 and 6.12.1, my cache randomly crashes with a BTRFS error and switches to read-only. It's always the same; it starts with:

Jun 25 14:15:33 Becky kernel: BTRFS error (device nvme0n1p1): block=762904576 write time tree block corruption detected
Jun 25 14:15:33 Becky kernel: BTRFS: error (device nvme0n1p1) in btrfs_commit_transaction:2460: errno=-5 IO failure (Error while writing out transaction)
Jun 25 14:15:33 Becky kernel: BTRFS info (device nvme0n1p1: state E): forced readonly
Jun 25 14:15:33 Becky kernel: BTRFS warning (device nvme0n1p1: state E): Skipping commit of aborted transaction.
Jun 25 14:15:33 Becky kernel: BTRFS: error (device nvme0n1p1: state EA) in cleanup_transaction:1958: errno=-5 IO failure

No need to reboot the hardware; just stopping the array and restarting it switches my cache back online. I ran two different memtests (Passmark's and Memtest86+); the hardware is stable. Temperatures are okay, the CPU never goes above 75°C. I even switched SSDs, from a SATA drive to an NVMe. Same errors. Does anyone have an idea? I'm really thinking of switching my cache from BTRFS to XFS.

becky-diagnostics-20230625-1604.zip
Fl4v1en Posted June 25, 2023

For now, I've tried switching my Docker storage from a btrfs vdisk to a directory.
JorgeB Posted June 26, 2023

17 hours ago, Fl4v1en said: "write time tree block corruption detected"

This usually means bad RAM or other kernel memory corruption. You could try redoing the pool, or using ZFS to see if it's more stable.
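If you want to check the pool from the console as well, something like this works (a sketch; it assumes the pool is mounted at /mnt/cache, adjust to your mount point):

# Show per-device error counters (write/read/flush/corruption/generation)
btrfs dev stats /mnt/cache

# Run a full scrub; -B stays in the foreground and prints a summary when done
btrfs scrub start -B /mnt/cache

# Pull any btrfs messages out of the kernel log
dmesg | grep -i btrfs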
mhyclak Posted July 17, 2023

I am having the same symptoms after upgrading from 6.11.5 to 6.12.2, on a btrfs mirror of 2 NVMe drives. I reformatted it once after the upgrade to 6.12.2 and it triggered again sometime yesterday. Docker and VMs are running on a separate btrfs mirror (/mnt/user/virtualization). /mnt/user/cache is usually what I've noticed go read-only; most of what goes through it is system backups (Time Machine, AOMEI Backupper and Windows Backup). I attached diags in case there's something similar. These issues only started after the upgrade; no other changes to the hardware have been made.

phoenix-diagnostics-20230717-0759.zip
JorgeB Posted July 17, 2023

2 hours ago, mhyclak said: "I am having the same symptoms"

Btrfs is detecting checksum errors, so the first thing is to run memtest.
Arron Posted July 25, 2023

I too am having this problem. The system was running fine until the upgrade to 6.12, and it's still happening after 6.12.3. I have two NVMe pools: cache and VMs. I didn't have any BTRFS errors on the VMs' NVMe drive, so I swapped the two, and now the newly swapped-in cache drive is getting the same BTRFS errors. The old cache NVMe drive, now used for my VMs, hasn't had a single BTRFS error since the swap.

I've successfully cleared the errors by running a scrub and finding the corrupt data (roughly the routine sketched below), but any time a new file is downloaded from NZBGet I get the same BTRFS errors and have to locate the corrupt data to clear them. This is a daily thing now. If I don't clear the errors daily, eventually the corrupt data gets to be too much, Docker containers begin to fail, and I have to move all the data on the cache pool onto the array, reformat the drive, and move the data back onto the cache pool. Pretty frustrating, to say the least.

Jul 25 02:05:12 Unraid kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 81, gen 0
Jul 25 02:05:12 Unraid kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 855986 off 24414867456 csum 0x329e22d6 expected csum 0x43aeae62 mirror 1
Jul 25 02:05:12 Unraid kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 82, gen 0
Jul 25 02:05:12 Unraid kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 855986 off 24414867456 csum 0x329e22d6 expected csum 0x43aeae62 mirror 1

Edit: added diag files.

unraid-diagnostics-20230731-0722.zip
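For anyone else stuck doing the same daily cleanup, this is roughly my routine (a sketch; the inode number comes from the "csum failed ... ino" warnings in my log above, and the mount point is from my setup):

# Scrub the pool; -B stays in the foreground and prints a summary when done
btrfs scrub start -B /mnt/cache

# Map the inode from the warning (e.g. "ino 855986") back to a file path
btrfs inspect-internal inode-resolve 855986 /mnt/cache

# After deleting or restoring the corrupt file, print and zero the error counters
btrfs dev stats -z /mnt/cache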
DOM_EU Posted July 25, 2023

On 6/25/2023 at 4:21 PM, Fl4v1en said: "Since the release of 6.12 and 6.12.1, my cache randomly crashes with a BTRFS error and switches to read-only. [...]"

I have exactly the same problem, with the same error pattern. The error has existed since the upgrade from v6.11.5 to v6.12.3; I never had any problems with version 6.11.5. I'll also try switching from the btrfs vdisk to a directory now.
JorgeB Posted July 31, 2023

Please post the diagnostics.
turnma Posted July 31, 2023

I'm getting the same since upgrading from 6.11.something to 6.12.3: the cache disk goes read-only after a period of time. The server had been rock solid, running without issue or reboot for nearly 3 months before the upgrade; now it's died with 5 or 6 separate issues over the last couple of days. I'm currently rebuilding an array disk (unrelated, it seems) and the cache has gone read-only again. Once the array rebuilds I'll have to sort the cache again, but the server is unusable in this state because it means all the Docker containers go offline every day or so. Diagnostics attached, thanks.

tower-diagnostics-20230731-2309.zip
JorgeB Posted August 1, 2023

9 hours ago, turnma said: "cache disk going read only after a period of time"

You can downgrade back to v6.11.5 to see if the issue stops, in case it's a kernel/btrfs bug; if it remains, run memtest.
turnma Posted August 1, 2023

Thanks. Downgraded now, so I'll see pretty quickly if it helps. It's been a steady stream of issues since the upgrade, so if I get through 24 hours without issues then that will be a positive sign.
TimTaylor Posted August 2, 2023

Same problem with 6.12.2 and 6.12.3... my NAS is now broken because I can't get back to 6.11.
JorgeB Posted August 2, 2023

14 minutes ago, TimTaylor said: "my NAS is now broken because I can't get back to 6.11"

Why not?
TimTaylor Posted August 2, 2023

Actually I did revert to 6.11.5; now Dockers are not starting anymore:

"Execution error: Server error"

I'm going crazy with this.
JorgeB Posted August 2, 2023

25 minutes ago, TimTaylor said: "now Dockers are not starting anymore"

See the release notes; there's a procedure you must do after downgrading.
TimTaylor Posted August 2, 2023

I don't get it. The only thing I see there is:

"If you revert back from 6.12 to 6.11.5 or earlier, you have to force update all your Docker containers and start them manually after downgrading. This is necessary because of the underlying change to cgroup v2 starting with 6.12.0-rc1."

If I do this:

TOTAL DATA PULLED: 0 B
Removing container: OnlyOfficeDocumentServer
Error: Server error
Command execution: docker create --name='OnlyOfficeDocumentServer' --net='br0' --ip='192.168.50.252' -e TZ="Europe/Berlin" -e HOST_OS="Unraid" -e HOST_HOSTNAME="NAS" -e HOST_CONTAINERNAME="OnlyOfficeDocumentServer" -e 'TCP_PORT_80'='80' -e 'TCP_PORT_443'='443' -e 'JWT_SECRET'='' -l net.unraid.docker.managed=dockerman -l net.unraid.docker.webui='http://[IP]:[PORT:80]' -l net.unraid.docker.icon='https://raw.githubusercontent.com/SiwatINC/unraid-ca-repository/master/icons/onlyoffice.png' -v '/mnt/user/appdata/onlyofficeds/logs':'/var/log/onlyoffice':'rw' -v '/mnt/user/appdata/onlyofficeds/Data':'/var/www/onlyoffice/Data':'rw' -v '/mnt/user/appdata/onlyofficeds/fonts':'/usr/share/fonts':'rw' 'onlyoffice/documentserver'
Error response from daemon: Conflict. The container name "/OnlyOfficeDocumentServer" is already in use by container "18f2c1ef672d6222dfe1362d0ea4702ad39448b70d8789f1554cc11c011d19c8". You have to remove (or rename) that container to be able to reuse that name.
The command failed.
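Reading the error again, it sounds like the old container is still registered under that name, so I'll try removing the stale one from the console first and then force updating again, something like this (container name from my setup):

# Stop the stale container if it's still listed as running (ignore the error if not)
docker stop OnlyOfficeDocumentServer

# Remove it so the name is free and the template can recreate it
docker rm OnlyOfficeDocumentServer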
mhyclak Posted August 3, 2023

On 7/17/2023 at 10:33 AM, JorgeB said: "Btrfs is detecting checksum errors, so the first thing is to run memtest."

Reverting to 6.11.5 has resolved the issues for me, so I suspect it must be something with the kernel or btrfs versions in 6.12.
JorgeB Posted August 3, 2023

It's a possibility, since there have been more cases than usual, though for me it's been working fine.
local.bin Posted August 3, 2023

I have the same issue since upgrading to 6.12. I have removed the Dockers, deleted the docker img file and recreated all the Dockers, in the hope it solves the problem. At least it's running through, as the other 6.12 server is now borked.
turnma Posted August 13, 2023

Just wanted to report back that since I downgraded (on the 1st) it's been rock solid again. I'm slightly nervous now about what this means for the future, because it obviously means that upgrades are effectively out of the question until the bugs are fixed.
JorgeB Posted August 14, 2023

If it's a kernel issue, and it may be, it's still a corner case, since most users are not affected; it should be fixed in an upcoming release. Try v6.13 once available, which will include a much newer kernel.
Shadowfita Posted August 26, 2023

On 8/3/2023 at 11:03 PM, mhyclak said: "Reverting to 6.11.5 has resolved the issues for me [...]"

I've made an account just to reply and say thank you very much to those suggesting the 6.11.5 rollback, as it has fixed my issues. I was experiencing the same btrfs errors discussed in this thread.
wayner Posted September 5, 2023

I will just add a ME TOO to this thread. My system was rock solid under 6.11.5, and since I upgraded a few days ago to 6.12.4 I am having btrfs issues. I will try rolling back to 6.11.5.
krh1009 Posted September 15, 2023

I'm adding a ME TOO. Same problem running 6.12.1.
ramius87 Posted December 6, 2023

I did not know about this issue, but I am now experiencing the same after upgrading from 6.11.1 to 6.12.4. I will be converting my cache pool to ZFS and will report back with the results.