ZikPhil Posted November 27, 2018 (edited)

Hey guys, I purchased 2x 1TB SSDs for a cache pool 2 days ago. I removed my existing single 500GB SSD cache and moved it to the array, then moved my files over to the new cache pool, and everything worked great for 2 days. This morning, after rebooting my server, I get a message that both my cache drives are unmountable. I followed all the steps in the FAQ and it still doesn't work. If I ask the Unraid GUI to format my drives, it goes into "Formatting..." and then comes back with the same error. The funny thing is that if I go to the command line and mount a drive manually, it works; it just seems like Unraid is unable to do it. I wish I had realized that before wiping both my disks, but whatever. I am not sure what to do next.

Things I have tried:
- New Config
- Format from the command line
- btrfsck

root@Jarvis:/mnt# sudo btrfsck --repair /dev/sdi1
enabling repair mode
Checking filesystem on /dev/sdi1
UUID: 70f15614-5284-4602-a83a-5e643e39406d
Fixed 0 roots.
checking extents
No device size related problem found
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125220
file data blocks allocated: 0

root@Jarvis:/mnt# btrfs check --repair /dev/sdi1
enabling repair mode
Checking filesystem on /dev/sdi1
UUID: 70f15614-5284-4602-a83a-5e643e39406d
Fixed 0 roots.
checking extents
No device size related problem found
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 131072 bytes used, no error found
total csum bytes: 0
total tree bytes: 131072
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 125220
file data blocks allocated: 0
referenced 0

jarvis-diagnostics-20181127-1142.zip
limetech Posted November 27, 2018

Please try this:
1. Stop array
2. Unassign the cache device completely
3. Start array
4. Stop array
5. Assign your cache device again
6. Start array

Does the cache device now mount?
ZikPhil (Author) Posted November 27, 2018

After doing that, I get the following:
limetech Posted November 27, 2018

OK, I didn't realize you had a 'pool' (2 or more devices assigned to cache). Stop the array, set the number of cache slots to 1, then start the array. This time it should mount. Let me know if that's the case and I'll give you some options for moving forward.

The problem stems from a bug (apparently) revealed by moving a device previously part of the cache pool to the parity array, while simultaneously assigning a different device to the cache. A single-slot cache device is treated differently in the code than when there are 2 or more slots assigned (this is to support backward compatibility for users upgrading from an older Unraid OS release where we only supported a single cache slot).
ZikPhil (Author) Posted November 27, 2018

Thank you. That resolved the issue with 1 cache drive. But as soon as I attach the other drive, it goes back to the 'No File System' issue.
JorgeB Posted November 27, 2018

If there's no data on the devices, they should format as a pool after wiping both with:

blkdiscard /dev/sdX
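For example, something like this (sdX and sdY standing in for the two pool members; blkdiscard destroys whatever is still on the device, so double-check the device names first):

# Confirm which devices currently carry btrfs signatures
blkid | grep btrfs

# Discard all blocks on both SSDs (this wipes any remaining data on them)
blkdiscard /dev/sdX
blkdiscard /dev/sdY

# Optionally, verify no stale filesystem signatures are left before re-adding them to the pool
wipefs -n /dev/sdX /dev/sdY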
limetech Posted November 27, 2018

Please post output of this command:

blkid
ZikPhil (Author) Posted November 27, 2018

While I have the error after adding the second cache disk, the output is:

root@Jarvis:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sde1: LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat"
/dev/sdg1: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs" PARTUUID="154c21ed-e4bd-4297-8e6d-065a26636261"
/dev/sdh1: LABEL="S.mkv^A" UUID="6b5fbbd4-fe06-9662-8eaf-e300881c9207" UUID_SUB="83ed1ac9-12e7-7ce8-987b-f5a9b78b49c7" TYPE="btrfs" PARTUUID="755c9b6e-6660-4f7b-af95-b44ef66ba1fe"
/dev/sdf1: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/sdk1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs" PARTUUID="65b9c9f1-81d2-4a7d-a226-9b06566c4c49"
/dev/sdi1: UUID="9802de40-7e6d-4357-ba36-4bcdafbf403a" UUID_SUB="f2583305-7f1a-4a3d-a2bb-fa600f4adaaa" TYPE="btrfs"
/dev/md1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs"
/dev/md2: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs"
/dev/md3: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/loop2: UUID="0d61dede-a6a7-43c8-a22a-49ce84900402" UUID_SUB="e5a994d2-87e6-4bfb-8aa8-573767621f8c" TYPE="btrfs"
/dev/loop3: UUID="7d9130fa-4957-4aa7-aa3b-28f0535a9750" UUID_SUB="d4825bd9-7e1f-467c-8a4e-451e8c2aa9ad" TYPE="btrfs"

With only 1 cache disk (sdi):

root@Jarvis:~# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sde1: LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat"
/dev/sdg1: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs" PARTUUID="154c21ed-e4bd-4297-8e6d-065a26636261"
/dev/sdh1: LABEL="S.mkv^A" UUID="6b5fbbd4-fe06-9662-8eaf-e300881c9207" UUID_SUB="83ed1ac9-12e7-7ce8-987b-f5a9b78b49c7" TYPE="btrfs" PARTUUID="755c9b6e-6660-4f7b-af95-b44ef66ba1fe"
/dev/sdf1: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/sdk1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs" PARTUUID="65b9c9f1-81d2-4a7d-a226-9b06566c4c49"
/dev/sdi1: UUID="9802de40-7e6d-4357-ba36-4bcdafbf403a" UUID_SUB="f2583305-7f1a-4a3d-a2bb-fa600f4adaaa" TYPE="btrfs"
/dev/md1: UUID="304393bd-fdc8-4668-8e61-ea879640a9f3" UUID_SUB="e28169e7-41d7-49ad-a84c-db9e85bb39e9" TYPE="btrfs"
/dev/md2: UUID="bf1ca78d-9a19-4d48-b3ab-731806e233d4" TYPE="xfs"
/dev/md3: UUID="9ca7cf49-1e4c-497e-abbf-3a0fb588b15c" TYPE="xfs"
/dev/loop2: UUID="0d61dede-a6a7-43c8-a22a-49ce84900402" UUID_SUB="e5a994d2-87e6-4bfb-8aa8-573767621f8c" TYPE="btrfs"
/dev/loop3: UUID="7d9130fa-4957-4aa7-aa3b-28f0535a9750" UUID_SUB="d4825bd9-7e1f-467c-8a4e-451e8c2aa9ad" TYPE="btrfs"
ZikPhil (Author) Posted November 27, 2018

47 minutes ago, johnnie.black said:
"If there's no data on the devices they should format as a pool after wiping both with: blkdiscard /dev/sdX"

Command ran but did not fix the situation.
limetech Posted November 27, 2018

1 hour ago, ZikPhil said:
"Thank you. That resolved the issue with 1 cache drive. But as soon as I attached the other drive, it goes back to the 'No File System' issue."

What do you mean by "attached the other drive"? When you increased the cache slot count from 1 to 2, did the second device show up already as 'assigned', so you didn't actually have to select it from the dropdown?
JorgeB Posted November 27, 2018

Also, please post new diags after attempting to format the pool; there was recently a case where Unraid was trying to format a pool with XFS.
ZikPhil (Author) Posted November 27, 2018 (edited)

23 minutes ago, limetech said:
"What do you mean by "attached the other drive"? When you increased the cache slot count from 1 to 2, did the second device show up already as 'assigned' and you didn't actually have to select from the dropdown?"

The second output is from when the single cache drive was working properly in the setup. The first output is from after I increased the cache slot count to 2 and selected my drive from the dropdown.
ZikPhil (Author) Posted November 27, 2018

Here is the video of the attempt and the diagnostics taken afterwards.

jarvis-diagnostics-20181127-1551.zip
limetech Posted November 27, 2018

In this state, please post output of:

btrfs fi show afb81ffe-a859-49e9-b7ec-9bf15c6a167b
JorgeB Posted November 28, 2018 (edited)

I can't see why it's not working, but I do see something I believe is not right (maybe a clue for Tom): after adding the 2nd device to the pool and starting the array, the syslog shows:

Nov 27 15:50:12 Jarvis emhttpd: cache TotDevices: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumDevices: 2
Nov 27 15:50:12 Jarvis emhttpd: cache NumFound: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMissing: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMisplaced: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumExtra: 2
Nov 27 15:50:12 Jarvis emhttpd: cache LuksState: 0

I believe it should be:

Nov 27 15:50:12 Jarvis emhttpd: cache TotDevices: 1
Nov 27 15:50:12 Jarvis emhttpd: cache NumDevices: 2
Nov 27 15:50:12 Jarvis emhttpd: cache NumFound: 1
Nov 27 15:50:12 Jarvis emhttpd: cache NumMissing: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumMisplaced: 0
Nov 27 15:50:12 Jarvis emhttpd: cache NumExtra: 1
Nov 27 15:50:12 Jarvis emhttpd: cache LuksState: 0
limetech Posted November 28, 2018

28 minutes ago, johnnie.black said:
"I believe it should be:"

You believe correctly. My suspicion is that the Linux 'blkid' subsystem is foobar'ed for some reason, and after the OP reports back, probably a reboot will solve the issue (though there probably exists a bug that got him into this state to begin with). The btrfs tools rely heavily on the blkid subsystem, and I remember that when we first integrated the multi-device btrfs cache pool, blkid was quite 'buggy' at the time.
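If a stale blkid result is the suspicion, a couple of read-only commands are sometimes worth trying before a full reboot (a generic sketch, not specific to this report):

# Re-register all btrfs member devices with the kernel
btrfs device scan

# Probe the partition directly, bypassing any cached blkid data
blkid -p /dev/sdi1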
ZikPhil (Author) Posted November 28, 2018

I performed several reboots since then without much luck.

root@Jarvis:~# btrfs fi show afb81ffe-a859-49e9-b7ec-9bf15c6a167b
ERROR: cannot scan /dev/sdh1: Input/output error
Label: none  uuid: afb81ffe-a859-49e9-b7ec-9bf15c6a167b
        Total devices 1 FS bytes used 8.97GiB
        devid    1 size 931.51GiB used 10.02GiB path /dev/sdi1
ZikPhil (Author) Posted November 28, 2018

3 hours ago, limetech said:
"You believe correctly. My suspicion is that linux 'blkid' subsystem is foobar'ed for some reason and after OP reports back, probably a reboot will solve the issue (though there probably exists a bug that got him in this state to begin with). btrfs tools rely heavily on blkid subsystem and I remember when we first integrated multi-device btrfs cache pool, blkid was quite 'buggy' at the time."

If you want to provide an SSH public key, I can open up the machine for you to connect to.
limetech Posted November 28, 2018

OK, I see what's happening. The easiest/fastest way for you to solve this now is to reassign your parity device: instead of assigning it to Parity, assign it to Parity 2. Alternatively, if feasible, empty Disk 1 and reformat it with an xfs file system.

The problem is this line in the 'btrfs fi show' output:

ERROR: cannot scan /dev/sdh1: Input/output error

Recall that each Parity block is the XOR of each corresponding Data block. You have a single btrfs disk (Disk 1) and two xfs disks (Disk 2 and 3). Apparently the XOR of the block that contains the btrfs signature is present on the Parity device. Thus 'btrfs fi show' tries to figure out whether this device happens to be part of the pool, and in that process tries to access a block outside the logical block range (because Parity is not a real btrfs device), so it generates that error message. But then the Unraid code that tries to determine the state of the pool sees that ERROR and bails out, marking the pool as 'unmountable'.
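If you want to confirm that phantom signature without touching the parity data, wipefs in no-act mode will simply list what it finds (per the earlier blkid output, /dev/sdh1 is the parity partition in this case):

# -n / --no-act: report any filesystem signatures found, change nothing
wipefs -n /dev/sdh1

# blkid reports the same phantom btrfs signature, complete with a garbage LABEL
blkid /dev/sdh1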
ZikPhil (Author) Posted November 28, 2018

I will try that today. Out of curiosity, is it better to have everything as xfs? It seems to automatically go to btrfs.
itimpi Posted November 28, 2018

5 minutes ago, ZikPhil said:
"I will try that today, out of curiosity. Is it better to have everything as xfs? It seems to automatically go in btrfs"

You can use XFS or BTRFS for the cache drive if you only have a single drive in the cache. The moment you want multiple drives in the cache, BTRFS is the only option.
limetech Posted November 28, 2018

2 hours ago, itimpi said:
"You can use XFS or BTRFS for the cache drive if you only have a single drive in the cache. The moment you want multiple drives in the cache then BTRFS is the only option."

Correct. And the default for parity-protected array data disks is xfs.
ZikPhil (Author) Posted November 30, 2018

Hey guys, sorry it took several days to transfer the files, re-format to xfs, put them back on and wait for parity to rebuild from scratch. After that was done successfully, I re-added the second cache SSD and it worked!! :) Thank you all for your help.
trurl Posted December 3, 2018

I just now saw this thread (stupid spam blocker; I didn't get the email to approve your account). Maybe I missed it, but I didn't see anybody trying to dissuade you from using an SSD in your array. SSDs in the array cannot be trimmed, can only write at the speed of parity, and there has been some discussion of whether or not it could even break parity. It doesn't look like you desperately need the added capacity, so I would recommend removing it from the array.
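For context on the trim point: trim is issued against a mounted filesystem that exposes discard, such as the cache pool, while array data disks sit behind Unraid's parity driver, which does not pass discard through. A generic illustration (not run on this server):

# Works on the btrfs cache pool, which talks to the SSDs directly
fstrim -v /mnt/cache

# Expected to report that the discard operation is not supported on a parity-protected data disk
fstrim -v /mnt/disk1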