BTRFS errors, docker and vms won't start

I'm seeing lots of BTRFS errors in logs, examples below...


 

Nov  8 04:46:47 Max kernel: BTRFS warning (device nvme0n1p1: state EA): csum failed root 5 ino 18493665 off 180224 csum 0x27aafbf4 expected csum 0x27aafb74 mirror 1
Nov  8 04:46:47 Max kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 224, gen 0



Nov  8 09:15:13 Max kernel: BTRFS info (device loop3: state E): forced readonly
Nov  8 09:15:13 Max kernel: BTRFS warning (device loop3: state E): Skipping commit of aborted transaction.
Nov  8 09:15:13 Max kernel: BTRFS: error (device loop3: state EA) in cleanup_transaction:1992: errno=-5 IO failure

 

Obviously the docker and vm issues are symptoms of 'forced readonly'. But not sure what would be causing the BTRFS issues, or how to recover. Any help would be greatly appreciated. Diagnostics attached

max-diagnostics-20241108-0936.zip

  • Community Expert

Btrfs csum errors are usually caused by bad RAM. 
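As a rough illustration of why a checksum mismatch like the one in the first log line tends to point at memory rather than the drive, the sketch below XORs the two checksum values from that line and counts how many bits differ. The helper name `differing_bits` is made up for this example, and the only inputs are the two values copied from the log. Treat it as a heuristic, not a diagnosis.

```python
def differing_bits(found: int, expected: int) -> int:
    """Count the bit positions where two checksum values differ."""
    return bin(found ^ expected).count("1")


# Values copied from the "csum failed" line in the first post.
found = 0x27AAFBF4      # "csum" (computed from the data that was read)
expected = 0x27AAFB74   # "expected csum" (stored by the filesystem)

# XOR leaves a 1 in every position where the two values disagree.
diff_mask = found ^ expected
print(hex(diff_mask), differing_bits(found, expected))
```

For this log line the two values differ in exactly one bit (mask `0x80`). Assuming the pool uses the default crc32c checksum, damaged data would normally change many bits of the checksum at once, so a one-bit difference is consistent with a bit flipping in RAM, although it does not prove it.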

  • Author

Hi, thanks for the suggestion. I've swapped out the RAM for new modules and I'm still getting issues. It seems I can start the server, log on and everything's fine but as soon as I start the array it craps out

  • Author

I've run a SMART self-test on the cache drive and it found no issues, although I noticed an error count of 31 with no further information... (screenshot attached)

  • Community Expert

Run a correcting scrub on the pool and post the results.
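For anyone who prefers the command line to the GUI, here is a minimal sketch of how the same scrub could be launched from Python. It assumes the pool is mounted at `/mnt/cache` (adjust to your pool name) and that the `btrfs` tool is on the PATH. The function names `scrub_command` and `run_scrub` are invented for this example. A scrub repairs what it can by default; the `-r` flag makes it read-only.

```python
import subprocess
from typing import List


def scrub_command(mount_point: str, read_only: bool = False) -> List[str]:
    """Build the argument list for a foreground btrfs scrub."""
    cmd = ["btrfs", "scrub", "start", "-B"]  # -B: wait and print stats at the end
    if read_only:
        cmd.append("-r")  # -r: check only, do not attempt repairs
    cmd.append(mount_point)
    return cmd


def run_scrub(mount_point: str) -> int:
    """Run the scrub and return the exit code (needs root and a mounted pool)."""
    return subprocess.run(scrub_command(mount_point)).returncode


# Show the command without running it; /mnt/cache is an assumed mount point.
print(" ".join(scrub_command("/mnt/cache")))
```

On a pool with a single device and no redundant copy of the data, the scrub can detect corrupt data blocks but has nothing to repair them from, which is why affected files end up needing to be restored.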

  • Author

I just want to check before running... is this right? Just click scrub with 'repair corrupted blocks' selected? (screenshot attached)

  • Community Expert

Yep

  • Author

I've started the process 3 times now. The first time it stopped after 600MB; I left it overnight in case it's a super long running process, but it didn't progress. I had to hard reboot the server as unraid wasn't responsive. Just run it again and it stopped at 10% with the following... (screenshot attached)

  • Community Expert
25 minutes ago, aphillippe said:

Just run it again and it stopped at 10% with the following... 

Do you mean the scrub stopped, or the server stopped responding?

 

If the former, also post new diags please.

  • Author

Retried several times and it finally finished.

(two screenshots attached)

 

I think there was an issue with docker or VMs flooding the network. After starting the array I was noticing network issues (other devices losing network/internet). Unplugging the unraid server from the switch solved it instantly. It only happened after the array started, so I assume it was docker or a VM. So I suspect the earlier scrubs may have been finishing in the background while the network issues made the UI and SSH unresponsive. The results above are from after I disabled docker and VMs and ran it again. It seems stable with the array started but no dockers/VMs for now.

 

So, is my SSD toast? Do I wipe and reinstall? Can I restore from my appdata backup or will that data be corrupt too? Thanks


  • Community Expert

You should delete the files mentioned in the syslog, or restore them from backup, since they have data corruption. Then re-run the scrub to confirm 0 errors.
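If a log line only gives an inode number rather than a file path, as in the "csum failed" line from the first post, something like the sketch below could help map it back to a file. It pulls the root and inode numbers out of the log text and prints a `btrfs inspect-internal inode-resolve` command for each one. The function name `corrupt_inodes` and the mount point `/mnt/cache` are assumptions for this example; root 5 is the top-level subvolume, so the pool's mount point should be the right path to pass in that case.

```python
import re

# Matches the root and inode fields of a btrfs "csum failed" warning.
CSUM_RE = re.compile(r"csum failed root (\d+) ino (\d+)")


def corrupt_inodes(log_lines):
    """Return sorted, de-duplicated (root, inode) pairs from syslog lines."""
    found = set()
    for line in log_lines:
        match = CSUM_RE.search(line)
        if match:
            found.add((int(match.group(1)), int(match.group(2))))
    return sorted(found)


# The two log lines from the first post in this thread.
sample = [
    "Nov  8 04:46:47 Max kernel: BTRFS warning (device nvme0n1p1: state EA): "
    "csum failed root 5 ino 18493665 off 180224 csum 0x27aafbf4 "
    "expected csum 0x27aafb74 mirror 1",
    "Nov  8 04:46:47 Max kernel: BTRFS error (device nvme0n1p1: state EA): "
    "bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 224, gen 0",
]

for root, ino in corrupt_inodes(sample):
    # Print the command rather than running it; /mnt/cache is an assumption.
    print(f"btrfs inspect-internal inode-resolve {ino} /mnt/cache")
```

To scan a whole log, pass the open file instead of the sample list, for example `corrupt_inodes(open("/var/log/syslog"))`, assuming that is where the syslog lives on your system.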

  • Author

Ok, I've deleted all the mentioned files. Rerunning the scrub, it seems to be stuck doing nothing... (screenshot attached)

 

I've hit cancel, shows up in logs but doesn't seem to cancel.

I've tried to reboot the server but the logs now show this... (screenshot attached)

And no reboot. I'm guessing there's something more fundamental than just those three files?

 

Thanks for the help so far, by the way. Much appreciated

  • Community Expert

(screenshot attached)

This error usually means bad RAM, or board/CPU.

  • Author

I've already replaced the RAM. What's the next step? I don't have spare board or CPU to swap out. Return both? Would wiping the SSD and putting new file system on there help? Or would this (or other similar) issues likely surface again? I'm a bit stuck at this point

  • Author

I left it running overnight and got this in the logs, if it helps... (screenshot attached)

  • Community Expert
24 minutes ago, aphillippe said:

Would wiping the SSD and putting new file system on there help?

You can try, and if new issues continue to appear, it will basically point to a hardware problem.
