[solved] ssdcache Unmountable: No file system

KaBo · July 6, 2022

Unraid Version: 6.9.2

Hello folks,

I have been running Unraid for a couple of months now; the server is often powered down and woken via WOL when needed. Yesterday I noticed that my VMs wouldn't start. The reason: they live on ssdcache only, and both SSDs are in the state "Unmountable: No file system" 😲

Diagnostics: kunraid-diagnostics-20220706-2215.zip

Array and Pool:

Shares (part):

Any hint where to start to get this fixed?

Edited July 7, 2022 by KaBo

JorgeB · July 7, 2022

Jul  6 21:38:14 kunraid kernel: BTRFS: device fsid 0ecc5969-4afc-4b9a-a299-3fab30cf63d9 devid 1 transid 153079 /dev/sdb1 scanned by udevd (1244)
Jul  6 21:38:14 kunraid kernel: BTRFS: device fsid 0ecc5969-4afc-4b9a-a299-3fab30cf63d9 devid 2 transid 147224 /dev/sdc1 scanned by udevd (1234)

devid 2 is on an older generation than devid 1; this can sometimes fix it:

btrfs-select-super -s 1 /dev/sdc1

If you rebooted since taking the diags, check that ssdcache2 is still sdc. Then reboot, and if the pool still doesn't mount, post new diags.
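Before and after running the command, the generation stored in a device's superblock can be read without changing anything on disk. This is only a minimal sketch, assuming `btrfs inspect-internal dump-super` is available in the installed btrfs-progs; the helper name `super_generation` is made up for illustration.

```shell
# Hypothetical helper: reads the output of `btrfs inspect-internal dump-super`
# on stdin and prints the value of the top-level "generation" field.
# Fields such as chunk_root_generation are skipped because the first
# column has to match "generation" exactly.
super_generation() {
    awk '$1 == "generation" { print $2 }'
}
```

It could be used as `btrfs inspect-internal dump-super /dev/sdc1 | super_generation` and again for `/dev/sdb1`; matching numbers would suggest both devices are on the same generation.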

KaBo · July 7, 2022

Thanks a lot @JorgeB! That was exactly it. Three questions remain, though:

How did you find out? Is the last number in brackets from the syslog lines the generation (the higher, the newer?), and does devid 2 have to be the newest?
Do I have to do anything else before the next reboot? sdb1 shows a lot of "bad tree block start" and "read error corrected" after reboot.
How did this happen and what can I do to avoid it? Any ideas?

Kai

JorgeB · July 8, 2022

10 hours ago, KaBo said:

Is the last number in brackets from syslog lines the generation

It's the transid; it must be the same on all devices in a pool for them to be in sync.
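The comparison described here can be sketched as a small filter over the syslog. In the lines quoted earlier, the value after `transid` is the one to compare; the number in parentheses at the end of each line is the process ID of udevd, not the generation. The function names `list_transids` and `transids_in_sync` are made up for illustration, and the sketch assumes the input contains scan lines for a single pool only.

```shell
# Hypothetical helper: print "<device> devid=<n> transid=<n>" for every
# btrfs device scan line read from stdin.
list_transids() {
    sed -n 's/.*BTRFS: device fsid [^ ]* devid \([0-9]*\) transid \([0-9]*\) \([^ ]*\).*/\3 devid=\1 transid=\2/p'
}

# Hypothetical helper: exit status 0 only if every scanned device reports
# the same transid (assumes all lines belong to one pool).
transids_in_sync() {
    list_transids | awk '
        { sub(/.*transid=/, ""); seen[$0] = 1 }
        END { n = 0; for (t in seen) n++; exit (n > 1) }'
}
```

For example, `list_transids < /var/log/syslog` would list the devices, and `transids_in_sync < /var/log/syslog || echo "pool out of sync"` would flag a mismatch like the one in this thread (153079 on sdb1 versus 147224 on sdc1).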

Please post new diags so I can better answer the other questions.

KaBo · July 8, 2022

I also noticed that 3 VMs are broken now: Linux and Windows 11 won't boot anymore 😕

Diagnostics: kunraid-diagnostics-20220708-1736.zip

JorgeB · July 8, 2022

Run a scrub on the pool, though, as mentioned here, files in nocow shares can't be fixed since they have no checksums. Your domains share is nocow, so assuming the vdisks are still there, you might need to restore them from backups, if available. I always recommend using cow shares for anything on btrfs.

As for the cause of the problem, it's most likely a device/firmware/controller issue: some writes were lost. Everything should be recovered after a scrub (except nocow shares, as mentioned).
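From the command line, the scrub and the nocow check could look roughly like the sketch below. It assumes the pool is mounted at /mnt/ssdcache and the share lives at /mnt/ssdcache/domains; the function names are made up for illustration, and the scrub can also be started from the Unraid GUI instead.

```shell
# Assumption: the pool from this thread is mounted at /mnt/ssdcache.
POOL=/mnt/ssdcache

# Hypothetical helper: scrub the pool in the foreground, then print the
# per-device error counters.
scrub_pool() {
    btrfs scrub start -B "$POOL"   # -B: do not background, print stats when done
    btrfs device stats "$POOL"
}

# Hypothetical helper: reads `lsattr -d <dir>` output on stdin and exits 0
# if the attribute column contains 'C' (No_COW), 1 otherwise.
has_nocow_flag() {
    awk '$1 ~ /C/ { found = 1 } END { exit !found }'
}
```

For example, `lsattr -d /mnt/ssdcache/domains | has_nocow_flag && echo "domains is nocow"` would confirm whether the share falls under the caveat above.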
