darrenyorston Posted November 4, 2017
As I have posted elsewhere, my pfSense VM has been pausing for two days now. I resume it, but it pauses again shortly after. Someone suggested it might be a cache disk issue. I checked and unRAID has reserved 200GB of my cache. I have 2 x 120GB and 2 x 520GB SSDs as cache. They were less than 50% full, something like 38%.
I shut down my server and adjusted a SATA power cable on one of the SSDs, and now Fix Common Problems is reporting:
cache (KINGSTON_SHFS37A120G_50026B7261075EF7) has file system errors (No file system (32))
If the disk is XFS / REISERFS, stop the array, restart the array in Maintenance mode, and run the file system checks. If the disk is BTRFS, then just run the file system checks. If the disk is listed as being unmountable, and it has data on it, whatever you do, do not hit the format button. Seek assistance HERE
My SSD is BTRFS. I looked to run the File System Check, but the option was grayed out with a message advising that it was only available when the array was running in Maintenance mode. I restarted the array in Maintenance mode and selected File System Check, but it doesn't appear to be doing anything. The text box says "running", but it's been like that for over 24 hours now. How long should it take to finish?
I have attached diagnostics again.
tower-diagnostics-20171104-1754.zip
JorgeB Posted November 4, 2017
btrfs fsck should only be run as a last resort, and it wouldn't take that long; it has probably hung. See here to try to recover your data: https://forums.lime-technology.com/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490
Squid Posted November 4, 2017
2 hours ago, johnnie.black said: btrfs fsck should only be run as a last resort, and it wouldn't take that long; it has probably hung. See here to try to recover your data: https://forums.lime-technology.com/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490
Updated FCP to reflect this.
darrenyorston Posted November 4, 2017
13 hours ago, johnnie.black said: btrfs fsck should only be run as a last resort, and it wouldn't take that long; it has probably hung. See here to try to recover your data: https://forums.lime-technology.com/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490
Well, I ran it because that's what Fix Common Problems said to do if the disk was unmountable. Having looked at your link, could you provide some guidance? I don't use the command line, so what's the starting point?
JorgeB Posted November 4, 2017
7 minutes ago, darrenyorston said: I don't use the command line, so what's the starting point?
SSH into the server and start on option 1; all the steps are there. Ask if you have a doubt about a specific step so I can improve it.
darrenyorston Posted November 4, 2017
You say to replace X with the actual device. So is it going to be "mkdir /sdi1" or "mkdir /dev/sdi1"? Leading to "mount -o recovery,ro /dev/sdX1 /dev/sdi1"?
darrenyorston Posted November 5, 2017
I have presumed it's "mkdir /x" leading to "mount -o recovery,ro /dev/sdi1 /x". No errors were reported and I am back at the command prompt. Where do I find the mounted disk to copy the files from? Nothing appears to have changed in the GUI.
remotevisitor Posted November 5, 2017
You mounted the disk onto the directory /x, so the files will be in the directory /x.
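The sequence discussed in the posts above can be sketched as a small helper that only prints the commands, so nothing is mounted until you run them yourself. This is a minimal sketch: the function name `recovery_mount_steps` is made up for illustration, `/x` is the temporary mount point used in this thread, and the device letter (`i` here) is whatever your own server shows for the cache SSD.

```shell
# Print the read-only recovery mount steps for one pool member.
# Assumptions: the device letter (e.g. "i" for /dev/sdi) is taken from your
# own system, and "/x" is just a temporary, empty directory to mount onto.
recovery_mount_steps() {
    dev_letter=$1
    mount_point=${2:-/x}
    # 1. Create an empty directory to act as the mount point.
    echo "mkdir -p ${mount_point}"
    # 2. Mount partition 1 of the device read-only in recovery mode.
    echo "mount -o recovery,ro /dev/sd${dev_letter}1 ${mount_point}"
    # 3. The rescued files then appear under the mount point itself.
    echo "ls ${mount_point}"
}

recovery_mount_steps i
```

The mount point is an ordinary directory, not a device, which is why it is `/x` rather than `/dev/sdi1`; the device path appears only as the source argument to `mount`.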
darrenyorston Posted November 5, 2017
I have found the directories using MC. What disk number will the /x drive be for the purposes of an MC copy? I tried "cp -r /mnt/x /mnt/disk4", but nothing happens. When I look at disk 4, no files have been copied.
OK. I have discovered that the command line doesn't work for me. I just used the MC menus instead.
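A likely reason the `cp` attempt above copied nothing is that the recovery mount lives at `/x`, not `/mnt/x`. A minimal sketch of a command-line copy follows, assuming `/x` is the recovery mount and using `/mnt/disk4/cache_rescue` as a purely hypothetical destination folder; the helper name `copy_rescued` is also made up for illustration.

```shell
# Copy everything under a rescue mount point into a folder on an array disk.
# Assumptions: the source is the recovery mount (e.g. /x) and the destination
# is a hypothetical folder such as /mnt/disk4/cache_rescue.
copy_rescued() {
    src=$1
    dest=$2
    # Refuse to run if the source directory does not exist.
    [ -d "$src" ] || { echo "source $src not found" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # "src/." copies the directory contents, including hidden files;
    # -a preserves ownership, permissions and timestamps.
    cp -a "$src"/. "$dest"/
}

# Usage (not run here): copy_rescued /x /mnt/disk4/cache_rescue
```

Copying into a dedicated folder rather than the root of the array disk keeps the rescued data separate from existing shares, which makes the later restore easier to check.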
darrenyorston Posted November 5, 2017
I have been able to copy all the material off the drive. I formatted the disk as per the instructions, but the disk still shows as unmountable.
darrenyorston Posted November 5, 2017
I have attached the diagnostics file again. The drive won't format; the GUI keeps saying it's "Unmountable". A different cache disk, one I have not touched, is now showing as a "New Device".
This feels like a rat hole I have gone down. My initial problem was a VM pausing, potentially due to 200GB of reserved space. I shut down the server and took the opportunity to adjust the path of a SATA cable, and now none of my Docker containers or VMs work.
What is the best way to get to a functional system?
tower-diagnostics-20171105-1139.zip
darrenyorston Posted November 5, 2017
I have been able to format the disk; I had to remove it from the cache first. Now I am trying to restore the files back to it, but the cache disk (/x) is reporting "Read-only file system (30)".
The unRAID cache disk message is also reporting "Cache pool BTRFS too many profiles X", with X being the SSD I just re-added.
JorgeB Posted November 5, 2017
If you have already copied everything important from your cache to the array or another place, it's now best to completely format your cache pool, start over with a clean filesystem, and then restore the data.
This will delete all data on the cache SSDs. Stop the array and type:
blkdiscard /dev/sdX
Replace X with each SSD identifier, one at a time. Then start the array and format the pool.
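Running `blkdiscard` once per SSD can be sketched as a loop. Because the command destroys all data on the device, this sketch is a dry run by default and only prints what it would do. The function name `discard_pool_devices`, the `RUN` switch, and the device names in the example call are all assumptions for illustration; use the identifiers your own server reports.

```shell
# Wipe each listed SSD with blkdiscard, one device at a time.
# Dry run by default: commands are only printed. Set RUN=1 to execute them,
# which destroys ALL data on the named devices.
# Assumption: the device names passed in are hypothetical examples.
discard_pool_devices() {
    for dev in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            blkdiscard "/dev/$dev"
        else
            echo "blkdiscard /dev/$dev"
        fi
    done
}

discard_pool_devices sdf sdg sdh sdi
```

Printing first gives a chance to confirm that every name in the list is a cache SSD and not an array or parity disk before anything is erased.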
darrenyorston Posted November 5, 2017
No, I have not copied everything off the cache. Why would I? The instructions I followed advised copying the material off one disk, not the entire cache.
I am not feeling overly confident here. It seems the "fix" is worse than the problem. At least I had access to my files and Docker. Now I have neither.
JorgeB Posted November 5, 2017
3 minutes ago, darrenyorston said: Why would I? The instructions I followed advised copying the material off one disk, not the entire cache.
When you mount a device from a pool in recovery mode and the mount is successful, the whole pool is mounted. You need to copy everything you want from the pool so it can be reformatted. This process is to recover the data, not to fix the pool; the pool needs to be recreated.
darrenyorston Posted November 5, 2017
Well, in that case something didn't work. I copied all the directories off the mount and it totalled only 60GB; it was 300GB or so prior to mounting the drive.
At the moment none of my Docker containers work, the VMs tab is blank, and I cannot access shares on the array.
JorgeB Posted November 5, 2017
Where is the docker image? If it was on the cache it's probably corrupt and needs to be recreated. If you have your appdata, you can recreate all the Docker containers from the previous templates to retain all config options.
Same for the VMs: look where libvirt.img was stored, then restore it from backup or recreate it. If it is recreated you'll need to recreate your VMs, or restore the XMLs from backups if you have them. If you still have the old vdisks (or backups of them) you can re-use them and not lose any data.
For the array shares we need to see the diagnostics; it could be filesystem corruption on the array.
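To answer the "where is the docker image" question from the command line, one option is to search the mounted disks for the two image files named above. This is a rough sketch: the helper name `find_images` is made up, and the idea that the images sit within four directory levels of the search root (for example under `/mnt/cache/system/...`) is an assumption about a typical layout, not something confirmed in this thread.

```shell
# List docker.img and libvirt.img files below a starting directory.
# Assumptions: the images are at most 4 levels below the start directory
# (override with a second argument); on unRAID the start is typically /mnt.
find_images() {
    find "$1" -maxdepth "${2:-4}" -type f \
        \( -name 'docker.img' -o -name 'libvirt.img' \) 2>/dev/null
}

# Usage (not run here): find_images /mnt
```

If the search returns `docker.img` but no `libvirt.img`, as reported later in the thread, the VM definitions will have to come from a backup or be recreated.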
darrenyorston Posted November 5, 2017
When I browse the cache folder from the GUI, the appdata and other folders are there. They are empty, however. There is no libvirt.img. The only *.img file is docker.img. I have already uploaded the diagnostics file.
JorgeB Posted November 5, 2017
I can't tell whether you currently have a mounted cache or not; I need to see current diagnostics.
JorgeB Posted November 5, 2017
41 minutes ago, darrenyorston said: Well, in that case something didn't work. I copied all the directories off the mount and it totalled only 60GB; it was 300GB or so prior to mounting the drive.
If the mount works, all pool data should be available, except in the case mentioned in the FAQ:
Quote: Note that if there are more devices missing than the profile permits for redundancy it may still mount but there will be some data missing, e.g., mounting a 4 device raid1 pool with 2 devices missing will result in missing data.
Edit: it's also possible you had multiple profiles on your cache and only one of them mounted, but you should get a warning about that if system notifications are enabled.
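One way to check for the multiple-profile situation described above is to look at the output of `btrfs filesystem df` for the mounted pool, where each line names a block-group type and its profile (for example `Data, RAID1: ...`). The sketch below reads that output on standard input and reports any type that appears with more than one profile. The helper name `mixed_profiles` is made up, and treating this as the condition behind unRAID's "too many profiles" message is an assumption.

```shell
# Report block-group types that appear with more than one profile.
# Reads "btrfs filesystem df <mountpoint>" output on stdin, where each line
# looks like "Data, RAID1: total=..., used=...".
mixed_profiles() {
    awk -F'[,:] *' '
        # $1 is the type (Data, Metadata, System, ...), $2 is the profile.
        !seen[$1 SUBSEP $2]++ {
            count[$1]++
            list[$1] = list[$1] " " $2
        }
        END {
            for (t in count)
                if (count[t] > 1)
                    print t ":" list[t]
        }
    '
}

# Usage (not run here): btrfs filesystem df /x | mixed_profiles
```

An empty result means every type uses a single profile. A line such as `Data: RAID1 single` would indicate that data chunks exist in two profiles, which fits a pool where a conversion or device change was left incomplete.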
darrenyorston Posted November 5, 2017
I posted that earlier: it's reporting "Cache pool BTRFS too many profiles X", with X being the SSD I just re-added.
JorgeB Posted November 5, 2017
You shouldn't have tried to add it back, but try mounting the pool using one of the other SSDs and check whether you can access different data.
darrenyorston Posted November 5, 2017
I followed your instructions here
JorgeB Posted November 5, 2017
31 minutes ago, darrenyorston said: I followed your instructions here
Yes, and where does it say to add the device back?
Like I said, try the recovery mount again, but now using one of the other SSDs. If it doesn't mount with -o recovery,ro, try the option below it for when there's a missing device (-o degraded,recovery,ro).
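The "try one option, then fall back to the degraded one" advice above can be sketched as a short function. The two option strings come straight from the post; the function name `try_recovery_mount` and the `MOUNT_CMD` override are assumptions added here so the logic can be exercised without touching real disks.

```shell
# Try a read-only recovery mount, falling back to the degraded variant.
# MOUNT_CMD is an assumption added for this sketch; it defaults to the real
# "mount" command and can be pointed at a stand-in for testing.
try_recovery_mount() {
    dev=$1
    mount_point=$2
    for opts in recovery,ro degraded,recovery,ro; do
        if ${MOUNT_CMD:-mount} -o "$opts" "$dev" "$mount_point"; then
            echo "mounted $dev on $mount_point with -o $opts"
            return 0
        fi
    done
    echo "could not mount $dev" >&2
    return 1
}

# Usage (not run here): try_recovery_mount /dev/sdh1 /x
```

Both attempts are read-only, so a failed first try does not change anything on the pool before the degraded attempt is made.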
darrenyorston Posted November 5, 2017
9 hours ago, johnnie.black said: Yes, and where does it say to add the device back? Like I said, try the recovery mount again, but now using one of the other SSDs. If it doesn't mount with -o recovery,ro, try the option below it for when there's a missing device (-o degraded,recovery,ro).
Here:
If it mounts copy all the data from /x to other destination, like an array disk, you can use Midnight Command or your favorite tool, after all data is copied format the disk and restore data.
The disk mounted and I was able to copy the data off the drive. The format of the disk never worked, though. The GUI continued to report that the disk required formatting. When I told it to format, it would format for about 10 seconds or so and then show as unmountable again. So I removed the disk from the cache and used Unassigned Devices to unmount it. I then used Preclear Disks to prepare the disk. I then re-added the drive to the cache and it is working fine, or so it seems.
I have had to delete the docker image and re-add all the containers. The Docker containers are not working correctly at the moment, though, as they don't seem to be able to see the internet. I also had to delete the VMs and start anew.
I will probably look to move my data back onto FreeNAS now and just use unRAID for Docker containers. I'm concerned about unRAID's stability. Removing and replacing a SATA cable shouldn't result in such problems.
diagnostics.zip