ZikPhil Posted November 27, 2018 (edited)

Hey guys, I purchased 2x 1TB SSDs for a cache pool 2 days ago. I removed my existing single 500GB SSD cache and moved it to the array, then moved my files over to the new cache pool, and everything worked great for 2 days. This morning, after rebooting my server, I get a message that both my cache drives are unmountable. I followed all the steps in the FAQ and it still doesn't work. If I ask the Unraid GUI to format my drives, it goes into "Formatting..." and then comes back with the same error. The funny thing is that if I go to the command line and mount a drive manually, it works; it just seems like Unraid is unable to do it. I wish I had realized that before wiping both my disks, but whatever. I am not sure what to do next.

Things I have tried:
- New Config
- Format from the command line
- btrfsck

root@Jarvis:/mnt# sudo btrfsck --repair /dev/sdi1
enabling repair mode
Checking filesystem on /dev/sdi1
UUID: 70f15614-5284-4602-a83a-5e643e39406d
Fixed 0 roots.
checking extents
No device size related problem found
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125220
file data blocks allocated: 0

root@Jarvis:/mnt# btrfs check --repair /dev/sdi1
enabling repair mode
Checking filesystem on /dev/sdi1
UUID: 70f15614-5284-4602-a83a-5e643e39406d
Fixed 0 roots.
checking extents
No device size related problem found
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125220
file data blocks allocated: 0
referenced 0

jarvis-diagnostics-20181127-1142.zip
limetech Posted November 27, 2018

Please try this:
1. Stop array
2. Unassign the cache device completely
3. Start array
4. Stop array
5. Assign your cache device again
6. Start array

Does the cache device now mount?
ZikPhil (Author) Posted November 27, 2018

After doing that, I get the following:
limetech Posted November 27, 2018

OK, I didn't realize you had a 'pool' (2 or more devices assigned to cache). Stop the array, set the number of cache slots to 1, then start the array. This time it should mount. Let me know if that's the case and I'll give you some options for moving forward.

The problem stems from a bug (apparently) revealed by moving a device previously part of the cache pool to the parity array, while simultaneously assigning a different device to the cache. A single-slot cache device is treated differently in the code than when there are 2 or more slots assigned (this is to support backward compatibility for users upgrading from an older Unraid OS release where we only supported a single cache slot).
ZikPhil (Author) Posted November 27, 2018

Thank you. That resolved the issue with 1 cache drive. But as soon as I attach the other drive, it goes back to the 'No File System' issue.
JorgeB Posted November 27, 2018

If there's no data on the devices, they should format as a pool after wiping both with:

blkdiscard /dev/sdX
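For example, something like this (sdX and sdY standing in for the two pool members; blkdiscard destroys whatever is still on the device, so double-check the device names first):

# Confirm which devices currently carry btrfs signatures
blkid | grep btrfs

# Discard all blocks on both SSDs (this wipes any remaining data on them)
blkdiscard /dev/sdX
blkdiscard /dev/sdY

# Optionally, verify no stale filesystem signatures are left before re-adding them to the pool
wipefs -n /dev/sdX /dev/sdY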
limetech Posted November 27, 2018

Please post output of this command:

blkid
ZikPhil (Author) Posted November 27, 2018

While I have the error after adding the second cache disk, the output is:

root@Jarvis:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sde1: LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat"
/dev/sdg1: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs" PARTUUID="154c21ed-e4bd-4297-8e6d-065a26636261"
/dev/sdh1: LABEL="S.mkv^A" UUID="6b5fbbd4-fe06-9662-8eaf-e300881c9207" UUID_SUB="83ed1ac9-12e7-7ce8-987b-f5a9b78b49c7" TYPE="btrfs" PARTUUID="755c9b6e-6660-4f7b-af95-b44ef66ba1fe"
/dev/sdf1: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/sdk1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs" PARTUUID="65b9c9f1-81d2-4a7d-a226-9b06566c4c49"
/dev/sdi1: UUID="9802de40-7e6d-4357-ba36-4bcdafbf403a" UUID_SUB="f2583305-7f1a-4a3d-a2bb-fa600f4adaaa" TYPE="btrfs"
/dev/md1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs"
/dev/md2: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs"
/dev/md3: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/loop2: UUID="0d61dede-a6a7-43c8-a22a-49ce84900402" UUID_SUB="e5a994d2-87e6-4bfb-8aa8-573767621f8c" TYPE="btrfs"
/dev/loop3: UUID="7d9130fa-4957-4aa7-aa3b-28f0535a9750" UUID_SUB="d4825bd9-7e1f-467c-8a4e-451e8c2aa9ad" TYPE="btrfs"

With only 1 cache disk (sdi):

root@Jarvis:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sde1: LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat"
/dev/sdg1: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs" PARTUUID="154c21ed-e4bd-4297-8e6d-065a26636261"
/dev/sdh1: LABEL="S.mkv^A" UUID="6b5fbbd4-fe06-9662-8eaf-e300881c9207" UUID_SUB="83ed1ac9-12e7-7ce8-987b-f5a9b78b49c7" TYPE="btrfs" PARTUUID="755c9b6e-6660-4f7b-af95-b44ef66ba1fe"
/dev/sdf1: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/sdk1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs" PARTUUID="65b9c9f1-81d2-4a7d-a226-9b06566c4c49"
/dev/sdi1: UUID="9802de40-7e6d-4357-ba36-4bcdafbf403a" UUID_SUB="f2583305-7f1a-4a3d-a2bb-fa600f4adaaa" TYPE="btrfs"
/dev/md1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs"
/dev/md2: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs"
/dev/md3: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/loop2: UUID="0d61dede-a6a7-43c8-a22a-49ce84900402" UUID_SUB="e5a994d2-87e6-4bfb-8aa8-573767621f8c" TYPE="btrfs"
/dev/loop3: UUID="7d9130fa-4957-4aa7-aa3b-28f0535a9750" UUID_SUB="d4825bd9-7e1f-467c-8a4e-451e8c2aa9ad" TYPE="btrfs"
ZikPhil (Author) Posted November 27, 2018

47 minutes ago, johnnie.black said:
"If there's no data on the devices they should format as a pool after wiping both with: blkdiscard /dev/sdX"

Command ran but did not fix the situation.
limetech Posted November 27, 2018

1 hour ago, ZikPhil said:
"Thank you. That resolved the issue with 1 cache drive. But as soon as I attached the other drive, it goes back to the 'No File System' issue."

What do you mean by "attached the other drive"? When you increased the cache slot count from 1 to 2, did the second device show up already as 'assigned', so you didn't actually have to select it from the dropdown?
JorgeB Posted November 27, 2018

Also, please post new diags after attempting to format the pool; there was recently a case where Unraid was trying to format a pool with XFS.
ZikPhil (Author) Posted November 27, 2018 (edited)

23 minutes ago, limetech said:
"What do you mean by "attached the other drive"? When you increased the cache slot count from 1 to 2, did the second device show up already as 'assigned' and you didn't actually have to select from the dropdown?"

The second output is from when the single cache drive was working properly in the setup. The first output is from after I increased the cache slot count to 2 and selected my drive from the dropdown.
ZikPhil (Author) Posted November 27, 2018

Here is the video of the attempt and the diagnostics taken afterwards.

jarvis-diagnostics-20181127-1551.zip
limetech Posted November 27, 2018

In this state, please post output of:

btrfs fi show afb81ffe-a859-49e9-b7ec-9bf15c6a167b
JorgeB Posted November 28, 2018 (edited)

I can't see why it's not working, but I do see something I believe is not right (maybe a clue for Tom): after adding the 2nd device to the pool and starting the array, the syslog shows:

Nov 27 15:50:12 Jarvis emhttpd: cache TotDevices: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumDevices: 2
Nov 27 15:50:12 Jarvis emhttpd: cache NumFound: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMissing: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMisplaced: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumExtra: 2
Nov 27 15:50:12 Jarvis emhttpd: cache LuksState: 0

I believe it should be:

Nov 27 15:50:12 Jarvis emhttpd: cache TotDevices: 1
Nov 27 15:50:12 Jarvis emhttpd: cache NumDevices: 2
Nov 27 15:50:12 Jarvis emhttpd: cache NumFound: 1
Nov 27 15:50:12 Jarvis emhttpd: cache NumMissing: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMisplaced: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumExtra: 1
Nov 27 15:50:12 Jarvis emhttpd: cache LuksState: 0
limetech Posted November 28, 2018

28 minutes ago, johnnie.black said:
"I believe it should be:"

You believe correctly. My suspicion is that the Linux 'blkid' subsystem is foobar'ed for some reason, and after the OP reports back, probably a reboot will solve the issue (though there probably exists a bug that got him into this state to begin with). The btrfs tools rely heavily on the blkid subsystem, and I remember that when we first integrated the multi-device btrfs cache pool, blkid was quite 'buggy' at the time.
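If a stale blkid result is the suspicion, a couple of read-only commands are sometimes worth trying before a full reboot (a generic sketch, not specific to this report):

# Re-register all btrfs member devices with the kernel
btrfs device scan

# Probe the partition directly, bypassing any cached blkid data
blkid -p /dev/sdi1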
ZikPhil (Author) Posted November 28, 2018

I performed several reboots since then without much luck.

root@Jarvis:~# btrfs fi show afb81ffe-a859-49e9-b7ec-9bf15c6a167b
ERROR: cannot scan /dev/sdh1: Input/output error
Label: none  uuid: afb81ffe-a859-49e9-b7ec-9bf15c6a167b
        Total devices 1 FS bytes used 8.97GiB
        devid    1 size 931.51GiB used 10.02GiB path /dev/sdi1
ZikPhil (Author) Posted November 28, 2018

3 hours ago, limetech said:
"You believe correctly. My suspicion is that linux 'blkid' subsystem is foobar'ed for some reason and after OP reports back, probably a reboot will solve the issue (though there probably exists a bug that got him in this state to begin with). btrfs tools rely heavily on blkid subsystem and I remember when we first integrated multi-device btrfs cache pool, blkid was quite 'buggy' at the time."

If you want to provide an SSH public key, I can open up the machine for you to connect to.
limetech Posted November 28, 2018

OK, I see what's happening. The easiest/fastest way for you to solve this now is to reassign your parity device: instead of assigning it to Parity, assign it to Parity 2. Alternatively, if feasible, empty Disk 1 and reformat it with an xfs file system.

The problem is this line in the 'btrfs fi show' output:

ERROR: cannot scan /dev/sdh1: Input/output error

Recall that each Parity block is the XOR of each corresponding Data block. You have a single btrfs disk (Disk 1) and two xfs disks (Disk 2 and 3). Apparently the XOR of the block that contains the btrfs signature is present on the Parity device. Thus 'btrfs fi show' tries to figure out whether this device happens to be part of the pool, and in that process tries to access a block outside the logical block range (because Parity is not a real btrfs device), so it generates that error message. But then the Unraid code that tries to determine the state of the pool sees that ERROR and bails out, marking the pool as 'unmountable'.
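If you want to confirm that phantom signature without touching the parity data, wipefs in no-act mode will simply list what it finds (per the earlier blkid output, /dev/sdh1 is the parity partition in this case):

# -n / --no-act: report any filesystem signatures found, change nothing
wipefs -n /dev/sdh1

# blkid reports the same phantom btrfs signature, complete with a garbage LABEL
blkid /dev/sdh1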
ZikPhil (Author) Posted November 28, 2018

I will try that today. Out of curiosity, is it better to have everything as xfs? It seems to automatically go to btrfs.
itimpi Posted November 28, 2018

5 minutes ago, ZikPhil said:
"I will try that today, out of curiosity. Is it better to have everything as xfs? It seems to automatically go in btrfs"

You can use XFS or BTRFS for the cache drive if you only have a single drive in the cache. The moment you want multiple drives in the cache, BTRFS is the only option.
limetech Posted November 28, 2018

2 hours ago, itimpi said:
"You can use XFS or BTRFS for the cache drive if you only have a single drive in the cache. The moment you want multiple drives in the cache then BTRFS is the only option."

Correct. And the default for parity-protected array data disks is xfs.
ZikPhil (Author) Posted November 30, 2018

Hey guys, sorry it took several days to transfer the files, re-format to xfs, put them back on and wait for parity to rebuild from scratch. After that was done successfully, I re-added the second cache SSD and it worked!! :) Thank you all for your help.
trurl Posted December 3, 2018

I just now saw this thread (stupid spam blocker; I didn't get the email to approve your account). Maybe I missed it, but I didn't see anybody trying to dissuade you from using an SSD in your array. SSDs in the array cannot be trimmed, can only write at the speed of parity, and there has been some discussion of whether or not it could even break parity. It doesn't look like you desperately need the added capacity, so I would recommend removing it from the array.
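For context on the trim point: trim is issued against a mounted filesystem that exposes discard, such as the cache pool, while array data disks sit behind Unraid's parity driver, which does not pass discard through. A generic illustration (not run on this server):

# Works on the btrfs cache pool, which talks to the SSDs directly
fstrim -v /mnt/cache

# Expected to report that the discard operation is not supported on a parity-protected data disk
fstrim -v /mnt/disk1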