[cache pool] Unmountable: No file system (no btrfs devices)


ZikPhil

Recommended Posts

Hey guys,

 

I purchased 2x 1TB SSDs for a cache pool 2 days ago. I removed my existing single 500GB SSD cache drive and moved it to the array, then moved my files over to the new cache pool, and everything worked great for 2 days. This morning, after rebooting my server, I get a message that both my cache drives are unmountable. I followed all the steps in the FAQ and it still doesn't work. If I ask the Unraid GUI to format my drives, it goes into "Formatting..." and then comes back with the same error.

 

The funny thing is that if I go into the command line and manually mount the drive, it works. It just seems like Unraid is unable to do it. I wish I had realized that before wiping both my disks, but whatever. I am not sure what to do next.
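For reference, a manual mount along these lines works from the shell (the mount point is just a scratch directory I made up for testing):

mkdir -p /mnt/test
mount /dev/sdi1 /mnt/test    # mounts without complaint
umount /mnt/test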

 

Things I have tried:

- New Config

- Format using command line

- btrfsck

 

root@Jarvis:/mnt# sudo btrfsck --repair /dev/sdi1
enabling repair mode
Checking filesystem on /dev/sdi1
UUID: 70f15614-5284-4602-a83a-5e643e39406d
Fixed 0 roots.
checking extents
No device size related problem found
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125220
file data blocks allocated: 0

 

root@Jarvis:/mnt# btrfs check --repair /dev/sdi1
enabling repair mode
Checking filesystem on /dev/sdi1
UUID: 70f15614-5284-4602-a83a-5e643e39406d
Fixed 0 roots.
checking extents
No device size related problem found
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125220
file data blocks allocated: 0
 referenced 0

 


jarvis-diagnostics-20181127-1142.zip

Edited by ZikPhil
Link to comment

Ok didn't realize you had a 'pool' (2 or more devices assigned to cache).

Stop array, set number of cache slots to 1, then Start array.  This time it should mount.  Let me know if that's the case and I'll give you some options for moving forward.
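Once it mounts, you can sanity-check it from the console with something like this (assumes the standard /mnt/cache mount point):

btrfs filesystem show /mnt/cache    # should list the single cache device
df -h /mnt/cache                    # confirms it is mounted and shows usage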

 

The problem stems from a bug (apparently) revealed by moving a device previously part of the cache pool to the parity array, while simultaneously assigning different devices to the cache.  A single-slot cache device is treated differently in the code than when there are 2 or more slots assigned (this is to support backward compatibility for users upgrading from an older Unraid OS release where we only supported a single cache slot).

Link to comment

With the error present (after adding the second cache disk), the blkid output is:

root@Jarvis:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sde1: LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat"
/dev/sdg1: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs" PARTUUID="154c21ed-e4bd-4297-8e6d-065a26636261"
/dev/sdh1: LABEL="S.mkv^A" UUID="6b5fbbd4-fe06-9662-8eaf-e300881c9207" UUID_SUB="83ed1ac9-12e7-7ce8-987b-f5a9b78b49c7" TYPE="btrfs" PARTUUID="755c9b6e-6660-4f7b-af95-b44ef66ba1fe"
/dev/sdf1: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/sdk1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs" PARTUUID="65b9c9f1-81d2-4a7d-a226-9b06566c4c49"
/dev/sdi1: UUID="9802de40-7e6d-4357-ba36-4bcdafbf403a" UUID_SUB="f2583305-7f1a-4a3d-a2bb-fa600f4adaaa" TYPE="btrfs"
/dev/md1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs"
/dev/md2: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs"
/dev/md3: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/loop2: UUID="0d61dede-a6a7-43c8-a22a-49ce84900402" UUID_SUB="e5a994d2-87e6-4bfb-8aa8-573767621f8c" TYPE="btrfs"
/dev/loop3: UUID="7d9130fa-4957-4aa7-aa3b-28f0535a9750" UUID_SUB="d4825bd9-7e1f-467c-8a4e-451e8c2aa9ad" TYPE="btrfs"

With only 1 cache disk (sdi):

root@Jarvis:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sde1: LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat"
/dev/sdg1: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs" PARTUUID="154c21ed-e4bd-4297-8e6d-065a26636261"
/dev/sdh1: LABEL="S.mkv^A" UUID="6b5fbbd4-fe06-9662-8eaf-e300881c9207" UUID_SUB="83ed1ac9-12e7-7ce8-987b-f5a9b78b49c7" TYPE="btrfs" PARTUUID="755c9b6e-6660-4f7b-af95-b44ef66ba1fe"
/dev/sdf1: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/sdk1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs" PARTUUID="65b9c9f1-81d2-4a7d-a226-9b06566c4c49"
/dev/sdi1: UUID="9802de40-7e6d-4357-ba36-4bcdafbf403a" UUID_SUB="f2583305-7f1a-4a3d-a2bb-fa600f4adaaa" TYPE="btrfs"
/dev/md1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs"
/dev/md2: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs"
/dev/md3: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/loop2: UUID="0d61dede-a6a7-43c8-a22a-49ce84900402" UUID_SUB="e5a994d2-87e6-4bfb-8aa8-573767621f8c" TYPE="btrfs"
/dev/loop3: UUID="7d9130fa-4957-4aa7-aa3b-28f0535a9750" UUID_SUB="d4825bd9-7e1f-467c-8a4e-451e8c2aa9ad" TYPE="btrfs"

 

Link to comment
1 hour ago, ZikPhil said:

Thank you. That resolved the issue with 1 cache drive.

 

But as soon as I attached the other drive, it goes back to the 'No File System' issue.

 

What do you mean by "attached the other drive"?  When you increased the cache slot count from 1 to 2, did the second device show up already as 'assigned' and you didn't actually have to select from the dropdown?

Link to comment
23 minutes ago, limetech said:

 

What do you mean by "attached the other drive"?  When you increased the cache slot count from 1 to 2, did the second device show up already as 'assigned' and you didn't actually have to select from the dropdown?

 

The second output is from when the single cache drive was working properly in the array. The first output is from after I increased the cache slot count to 2 and selected my drive from the dropdown.

Edited by ZikPhil
Link to comment

I can't see why it's not working, but I do see something that I believe is not right; maybe a clue for Tom. After adding the 2nd device to the pool and starting the array, the syslog shows:

 

Nov 27 15:50:12 Jarvis emhttpd: cache TotDevices: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumDevices: 2
Nov 27 15:50:12 Jarvis emhttpd: cache NumFound: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMissing: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMisplaced: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumExtra: 2
Nov 27 15:50:12 Jarvis emhttpd: cache LuksState: 0

 

I believe it should be:

 

Nov 27 15:50:12 Jarvis emhttpd: cache TotDevices: 1
Nov 27 15:50:12 Jarvis emhttpd: cache NumDevices: 2
Nov 27 15:50:12 Jarvis emhttpd: cache NumFound: 1
Nov 27 15:50:12 Jarvis emhttpd: cache NumMissing: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMisplaced: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumExtra: 1
Nov 27 15:50:12 Jarvis emhttpd: cache LuksState: 0

 

 

 

Edited by johnnie.black
Link to comment
28 minutes ago, johnnie.black said:

I believe it should be:

You believe correctly.  My suspicion is that the Linux 'blkid' subsystem is foobar'ed for some reason, and after the OP reports back, a reboot will probably solve the issue (though there probably exists a bug that got him into this state to begin with).  btrfs tools rely heavily on the blkid subsystem, and I remember that when we first integrated the multi-device btrfs cache pool, blkid was quite 'buggy' at the time.
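If it really is a stale blkid cache, something along these lines might shake it loose (just a guess, it won't fix whatever bug caused the state in the first place):

blkid -g             # garbage-collect stale entries from the blkid cache
btrfs device scan    # re-register btrfs devices with the kernel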

Link to comment

I performed several reboots since then without much luck.

 

root@Jarvis:~# btrfs fi show afb81ffe-a859-49e9-b7ec-9bf15c6a167b
ERROR: cannot scan /dev/sdh1: Input/output error
Label: none  uuid: afb81ffe-a859-49e9-b7ec-9bf15c6a167b
	Total devices 1 FS bytes used 8.97GiB
	devid    1 size 931.51GiB used 10.02GiB path /dev/sdi1

 

Link to comment
3 hours ago, limetech said:

You believe correctly.  My suspicion is that the Linux 'blkid' subsystem is foobar'ed for some reason, and after the OP reports back, a reboot will probably solve the issue (though there probably exists a bug that got him into this state to begin with).  btrfs tools rely heavily on the blkid subsystem, and I remember that when we first integrated the multi-device btrfs cache pool, blkid was quite 'buggy' at the time.

 

If you want to provide a public ssh key, I can open up the machine for you to connect to.

Link to comment

Ok I see what's happening.  The easiest/fastest way for you to solve this now is to reassign your parity device: instead of assigning it to Parity, assign it to Parity 2.  Alternately, if feasible, empty Disk 1 and reformat it with an xfs file system.

 

The problem is this line in 'btrfs fi show' output:

ERROR: cannot scan /dev/sdh1: Input/output error

Recall that each Parity block is the XOR of each corresponding Data block.  You have a single btrfs disk (Disk 1) and two xfs disks (Disks 2 and 3).  Apparently the XOR of the blocks at the offset that holds the btrfs signature leaves what looks like a btrfs signature on the Parity device.  Thus 'btrfs fi show' tries to figure out whether this device happens to be part of the pool, and in that process tries to access a block outside the logical block range (because Parity is not a real btrfs device), so it generates that error message.  But then the Unraid code that tries to determine the state of the pool sees that ERROR and bails out, marking the pool as 'unmountable'.
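If you want to see for yourself what signature is being picked up on the parity device (device name taken from the error above), this is read-only:

wipefs /dev/sdh1      # without -a this only lists detected signatures, it erases nothing
blkid -p /dev/sdh1    # low-level probe that bypasses the blkid cache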

Link to comment
5 minutes ago, ZikPhil said:

I will try that today. Out of curiosity, is it better to have everything as XFS? It seems to automatically go to BTRFS.

You can use XFS or BTRFS for the cache drive if you only have a single drive in the cache.  The moment you want multiple drives in the cache, BTRFS is the only option.
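For context, a multi-device pool is just a multi-device btrfs filesystem underneath; roughly speaking, what gets created for a two-device raid1 pool would look something like this (illustration only, device names are placeholders, don't run it against disks holding data):

mkfs.btrfs -f -d raid1 -m raid1 /dev/sdX1 /dev/sdY1    # data and metadata mirrored across both SSDs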

Link to comment

I just now saw this thread (stupid SPAM blocker, I didn't get the email to approve your account).

 

Maybe I missed it, but I didn't see anybody trying to dissuade you from using an SSD in your array. SSDs in the array cannot be trimmed, can only write at the speed of parity, and there has been some discussion of whether or not they could even break parity.

 

It doesn't look like you desperately need the added capacity, so I would recommend removing it from the array.

 

 

Link to comment
