Cannot boot: BTRFS error ... help please everything is down!

Ystebad · November 18, 2021

The server went down, and I was unable to stream over WAN.

I came home to find it unresponsive over the LAN via IP login, and it wouldn't respond to ping.

I rebooted and it gets to the login prompt, but within a few seconds it locks up, saying:

"login: BTRFS: error (device nvme0n1p1) in btrfs_replay_log:2279: errno=-22 unknown (Failed to recover log tree)."

I can boot to safe mode.

Please help, I need to get the server back up and running.

Edited November 18, 2021 by Ystebad

Squid · November 18, 2021

1 hour ago, Ystebad said:

"login: BTRFS: error (device nvme0n1p1) in btrfs_replay_log:2279: errno=-22 unknown (Failed to recover log tree)."

When you see this next, press enter then login normally at the command prompt, followed by

diagnostics

The diagnostics file will get saved onto the flash drive (logs folder)

powerdown

then pull the flash and upload the entire applicable zip file here

Ystebad · November 18, 2021

It won't let me log in. The login prompt appears first, but before I can enter my name and password the error message appears and nothing works after that. It's locked up.

Just before the login prompt it says: warning: commands will be executed using /bin/sh

JorgeB · November 18, 2021

The btrfs error shouldn't prevent the server from starting, which is confirmed by the fact that you can boot in safe mode. That suggests a plugin or other configuration problem. You can post the diags after booting in safe mode so we can check the filesystem problem first.

Ystebad · November 18, 2021

11 hours ago, Squid said:
When you see this next, press enter then login normally at the command prompt, followed by
diagnostics
The diagnostics file will get saved onto the flash drive (logs folder)
powerdown
then pull the flash and upload the entire applicable zip file here

@Squid @JorgeB I ran diagnostics from safe boot; the file is attached. I'd appreciate any help, as the machine still will not boot in normal (non-GUI) mode.

monster-diagnostics-20211118-0831.zip

Edited November 18, 2021 by Ystebad

JorgeB · November 19, 2021

Diags after array start please.

Ystebad · November 19, 2021

3 hours ago, JorgeB said:

Diags after array start please.

I didn't realize the IP server login would work in safe mode .. duh.

This time after booting in safe mode I logged into server over IP to get access to GUI and start array. It did note an error with one of my 3 cache disks and mentioned I would have to format it. I hope not.

New diagnostics attached.

monster-diagnostics-20211119-0657.zip

JorgeB · November 19, 2021

47 minutes ago, Ystebad said:

It did note an error with one of my 3 cache disks and mentioned I would have to format it. I hope not.

There's filesystem corruption on that device; you can see some recovery options here.

Ystebad · November 19, 2021

17 minutes ago, JorgeB said:

There's filesystem corruption on that device; you can see some recovery options here.

It appears that all my dockers used this drive, with the share set to cache "prefer". If I remove this drive or cannot recover it, do I lose all my docker setup?

Edited November 19, 2021 by Ystebad

JorgeB · November 19, 2021

43 minutes ago, Ystebad said:

If I remove this drive or cannot recover it do I lose all my docker setup?

If you don't have an appdata backup, you'll lose the settings.

Ystebad · November 19, 2021

26 minutes ago, JorgeB said:

If you don't have an appdata backup, you'll lose the settings.

Well @$%^ me. Here I thought having 2 parity drives would keep this kind of thing from happening. So disappointing. I've ordered 2 new drives and will be installing them as a mirror for cache going forward. Is there a best practice for auto-backup of appdata?

JorgeB · November 19, 2021

16 minutes ago, Ystebad said:

Is there a best practice for auto-backup of app data?

You can use, for example, the CA backup and restore plugin.
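
For anyone who also wants a manual fallback next to the plugin, a minimal sketch of a dated appdata archive could look like this. The default paths /mnt/user/appdata and /mnt/user/backups/appdata are assumptions, not something confirmed in this thread, and the function name backup_appdata is hypothetical. It is probably best to stop the containers first so files are not changing while they are archived.

```shell
# Hypothetical helper: archive a source directory into a dated tarball.
# Both default paths are assumptions; adjust them to your own shares.
backup_appdata() {
  local src="${1:-/mnt/user/appdata}"
  local dest="${2:-/mnt/user/backups/appdata}"
  local stamp archive
  stamp="$(date +%Y%m%d-%H%M%S)"
  archive="${dest}/appdata-${stamp}.tar.gz"

  # Refuse to run if the source directory is missing.
  [ -d "$src" ] || { echo "source not found: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1

  # -C switches to the parent directory so the archive holds relative paths.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1

  # Print the archive path so the caller can log or verify it.
  echo "$archive"
}
```

Calling backup_appdata with no arguments uses the assumed defaults; passing a source and a destination directory overrides them. The destination should be on a different device than the cache, otherwise the backup is lost together with it.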

trurl · November 19, 2021

19 minutes ago, Ystebad said:

I thought having 2 parity drives would keep this kind of thing from happening.

20 minutes ago, Ystebad said:

mirror for cache going forward

And in general, 2 parity drives are not a substitute for backup, and neither is mirrored cache. Lots of ways to lose data that neither of those will help with, including user error.

Ystebad · November 19, 2021

9 minutes ago, JorgeB said:

You can use for example the CA backup and restore plugin.

Now that I think about it, I -might- have installed this. Since I started in safe mode no plugins show up so I'm not sure how to see if I actually have a backup.

Ystebad · November 19, 2021

2 hours ago, JorgeB said:

There's filesystem corruption on that device; you can see some recovery options here.

I'm trying to follow that. The dead drive is called "Cache_nvme" under Device and SPCC_M.2_PCIe_SSD_CD8E07080A9E04566172 (nvme0n1) under Identification.

I don't see either of these under /dev.

So I'm not sure what to use in place of /dev/sdX1 in "mount -o usebackuproot,ro /dev/sdX1 /x".

thank you so much for the help.

Edited November 19, 2021 by Ystebad

JorgeB · November 19, 2021

For NVMe devices it is /dev/nvmeXn1p1

Replace X with correct number.
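
As a rough illustration of that naming pattern, the sketch below builds the partition path from the controller number. The function name nvme_partition_path is hypothetical, and it assumes namespace 1, as in the pattern above.

```shell
# Hypothetical helper: build /dev/nvmeXn1pP from controller number X
# and partition number P (P defaults to 1). Assumes namespace 1.
nvme_partition_path() {
  local ctrl="$1"
  local part="${2:-1}"

  # Reject an empty or non-numeric controller number.
  case "$ctrl" in
    ''|*[!0-9]*) echo "controller must be a number" >&2; return 1 ;;
  esac

  printf '/dev/nvme%sn1p%s\n' "$ctrl" "$part"
}
```

For a device shown as nvme0n1 under Identification, nvme_partition_path 0 gives /dev/nvme0n1p1. Running ls /dev/nvme* shows which NVMe nodes are actually present.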

Ystebad · November 19, 2021

22 minutes ago, JorgeB said:

For NVMe devices it is /dev/nvmeXn1p1

Replace X with correct number.

Got this:

root@Monster:/dev# mount -o usebackuproot,ro /dev/nvme0n1p1 /x
mount: /x: wrong fs type, bad option, bad superblock on /dev/nvme0n1p1, missing codepage or helper program, or other error.

same result if I use the degraded option as well.
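
The mount error itself is generic; the kernel log usually carries the more specific btrfs message. A small sketch for pulling out just those lines follows. The function name btrfs_log_lines is hypothetical.

```shell
# Hypothetical helper: keep only btrfs-related lines from stdin,
# showing at most the last N of them (N defaults to 20).
btrfs_log_lines() {
  grep -i 'btrfs' | tail -n "${1:-20}"
}
```

Right after a failed mount attempt it would be used as: dmesg | btrfs_log_lines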

JorgeB · November 19, 2021

You can try a btrfs repair as a last resort, but you will probably need to restore from backups.
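
A cautious ordering of those steps can be sketched as a dry run that only prints the commands instead of running them. The function name print_btrfs_recovery_plan and the destination path in the usage note are hypothetical. The sketch assumes the device is unmounted and that copying files off with btrfs restore comes before any repair attempt, since btrfs check --repair writes to the damaged filesystem.

```shell
# Hypothetical helper: print a last-resort recovery sequence for review.
# Nothing is executed; copy the lines by hand once you are sure.
print_btrfs_recovery_plan() {
  local dev="$1"
  local dest="$2"
  [ -n "$dev" ] && [ -n "$dest" ] || {
    echo "usage: print_btrfs_recovery_plan DEVICE DEST_DIR" >&2
    return 1
  }

  # 1. Copy whatever is still readable to another disk.
  echo "btrfs restore $dev $dest"
  # 2. Inspect the damage without changing anything on the device.
  echo "btrfs check --readonly $dev"
  # 3. Last resort only: this modifies the filesystem and can make things worse.
  echo "btrfs check --repair $dev"
}
```

For example, print_btrfs_recovery_plan /dev/nvme0n1p1 /mnt/disk1/restore prints the three commands for the device in this thread, with the restore target on an array disk that has enough free space.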

Ystebad · November 19, 2021

58 minutes ago, JorgeB said:

You can try a btrfs repair as a last resort, but you will probably need to restore from backups.

So is there any way to tell whether this is a filesystem problem (i.e. I should have used XFS, not btrfs), a software glitch, or a drive hardware problem?

JorgeB · November 19, 2021

In my experience btrfs fs corruption is most often caused by hardware issues, mostly RAM or device firmware related.
