December 12, 2015

I noticed the other day that one of my shares claimed to be empty. After a little digging I found entries in the logs that looked like a filesystem error on one disk. The disk itself was still accessible separately, but my shares were no longer looking to that disk for data.

I'm running unRAID 5.0.6 Plus and I read through the filesystem info here: http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems#Drives_formatted_with_ReiserFS_using_unRAID_v5_or_later

After I had run reiserfsck, its output instructed me to run --rebuild-tree, so I ran that. Just as it was reaching the end of its run I wound up with the following:

Pass 1 (will try to insert 554929 leaves):
####### Pass 1 #######
Looking for allocable blocks .. finished
0%....20%....40%....60%....80% left 101977, 64 /sec
try_to_insert_pointer_to_leaf: bad search result
Aborted (core dumped)

I wasn't quite sure what to do next, but given that only one disk had been affected I thought I would try to take the array out of Maintenance Mode and bring the shares back online. Once I'd done that I found that unRAID was now claiming that the affected disk wasn't formatted. Since I understand that reiserfsck's writes also go to parity, I'm not in a position to replace the drive and let parity rebuild it.

I thought it might be worthwhile to see if anyone has any advice for me before I risk doing any more harm.
December 13, 2015

I don't know why, but reiserfsck (the tool that was supposed to fix all your problems!) has failed you. It appears to have made a lot of progress, then crashed with a core dump! You'll need to reboot, if you haven't already. Since it did make progress, the next run should proceed farther (we hope!). Don't do anything else; just start in Maintenance Mode again and rerun the --rebuild-tree.

Whatever you do, DO NOT format that drive! It wouldn't be fatal, but it certainly would not help. If you can't get the array to start in Maintenance Mode without formatting, then you may have to go the command line route.
December 13, 2015

How much memory do you have in your system? I was just wondering if reiserfsck ran out of memory.

Also, just checking that you are running reiserfsck while the disk is not mounted in the array (which is best done by putting the system into Maintenance Mode).
December 13, 2015

If it is a RAM issue, then booting into Safe Mode might help, as that will stop any plugins loading.
December 13, 2015 Author

After my --rebuild-tree crashed I found that my drive is no longer attached to /dev/md2. I've re-run a check against /dev/sdd1 and it also aborted, indicating that I now have a bad root block 0 because the rebuild-tree didn't complete. I've stopped the array entirely now, so I would think that my plugins aren't taking up RAM or causing me issues. I haven't tried rebooting yet.

I realize that running against /dev/sdd1 won't update parity, but I've tried the same checks there and still wind up with the same results. It looks like the starting block for my partition has changed. Running a --rebuild-sb indicates that there are errors, but it seems successful. So far nothing has re-established that partition starting point to let me mount the disk.

I do have a backup (albeit somewhat out of date), so I would love to get the disk mounted to see how much I care. I'm working now on getting another disk mounted to make a copy of the partition.

I built my unRAID box about a year ago with only 4GB of RAM. It was enough at the time and I've never seen any performance issues, but perhaps I've outgrown it. I have 4x 3TB storage drives, 1x 3TB parity drive, and a 160GB cache drive for my plugins, with about 4TB free overall.
December 13, 2015

4GB of RAM should be plenty for reiserfsck. However, note that plugins take up RAM even when they are not running, as they are loaded during the boot sequence. Booting in Safe Mode stops them even being loaded.

Have you tried rebooting the system? The fact that md2 disappeared suggests that the disk might have dropped offline.

Note that you can also unassign the disk, start the array in Maintenance Mode with the disk being emulated, and then run reiserfsck against the emulated disk. If that works, then the way forward is to rebuild the disk in question from the emulated disk.
December 13, 2015 Author

I hadn't tried Safe Mode yet. That's next on my list. I took the opportunity to shut down the box, blow out the dust, and throw another 4GB of RAM into it. I just powered it up to make sure that it recognized the RAM, and now I'm bringing it back up in Safe Mode for another go at --rebuild-tree.

When I brought it back up after the RAM upgrade, md2 was back as it should be. A reiserfsck --check still showed it beefing that the rebuild-tree didn't complete.

It's a case of the cobbler's children having no shoes. I make sure my customers' backups are always running smoothly and I've neglected my own. I had some issues with getting my backup NAS mounted after moving the network to a new house and just haven't gotten to fixing it. If my backups were more up-to-date, I'd just wipe the disk and let parity do its job.
December 13, 2015

"When I brought it back up after the RAM upgrade md2 was back as it should be. A reiserfsck --check still showed it beefing that the rebuild-tree didn't complete."

I think that when you start with the --rebuild-tree option, reiserfsck writes something to the disk. In my experience the --check option will now always fail until a --rebuild-tree run completes.
December 13, 2015 Author

"I think that when you start with the --rebuild-tree option reiserfsck writes something to the disk. In my experience the --check option will now always fail until a --rebuild-tree run completes."

I got that impression when I was running the checks. It's running a --rebuild-tree now, with its RAM doubled to 8GB and unRAID booted in Safe Mode. I realize that breaks the cardinal rule of varying only one thing at a time, but I think Safe Mode is where I should have started regardless of RAM level.

I've just gotta wait for results now. With all of my storage offline I'm getting lots of work done on my network rack and equipment that I've been putting off for quite a while.
December 14, 2015 Author

It took the better part of a day, but the --rebuild-tree has finally completed. It looks to me like Safe Mode is the answer. I'll see if I can update the wiki, as it advises using Maintenance Mode prior to running reiserfsck. I have encountered some other examples of information in the wiki that doesn't take into account those who might have plugins installed.

As you can imagine, my lost+found directory is pretty large. Fortunately a good portion of it is folders, so the contents are still intact. I do still have a lot of files to sift through to determine what they are and whether they're still functional. Has anyone found a reliable method for doing that?
December 14, 2015

"It took the better part of a day but the --rebuild-tree has finally completed. It looks to me like Safe Mode is the answer. I'll see if I can update the wiki as it advises using Maintenance Mode prior to running reiserfsck. I have encountered some other examples of information in the wiki that doesn't take into account those who might have plugins installed."

In most cases running in Maintenance Mode is sufficient. However, I agree that updating the wiki article to suggest that you first boot into Safe Mode before starting the array in Maintenance Mode is a good idea, as it will never do any harm.

"As you can imagine, my lost & found directory is pretty large. Fortunately a good portion of it is folders so the contents are still intact. I do still have a lot of files to sift through to determine what they are and if they're still functional. Has anyone found a reliable method for doing that?"

Obviously, for the folders you can normally determine from their contents what they should be. For files that are not in a folder, the Linux 'file' command can be useful for determining the type of any particular file, as they will have lost any file extension. However, that does not tell you whether the file is intact or damaged in some way. The only way to do that is to either compare them against backups or have file checksums that they can be checked against.
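For anyone who would rather script the two checks described above, here is a minimal Python sketch. It is not how the `file` utility itself works internally, just the same general idea of looking at a file's leading bytes. The signature table, the labels, and the function names `guess_type` and `sha256_of` are made up for illustration and cover only a handful of formats; a match says nothing about whether the rest of the file is intact, which is what the checksum comparison against a backup copy is for.

```python
import hashlib

# A few well-known leading-byte signatures. Illustrative only: the real
# `file` utility knows far more formats than this short table.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip container (epub, docx, ...)"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]


def guess_type(path):
    """Guess a file's type from its first bytes; returns a label or 'unknown'."""
    with open(path, "rb") as fh:
        head = fh.read(68)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    # MOBI files carry the marker "BOOKMOBI" at byte offset 60.
    if head[60:68] == b"BOOKMOBI":
        return "mobi"
    return "unknown"


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A recovered file whose `sha256_of` value equals that of the matching file in the backup can be treated as intact; a differing value means the file either changed after the backup was taken or was damaged.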
December 14, 2015 Author

There is a pattern to the naming of recovered files and folders in the lost+found folder. I don't know where reiserfsck got its numbers, but it's very specific about what is/was in what folder. Each renamed file, while missing its extension, is identified in a <folder>_<file> format. The number in the <file> position is random, but all files that were recovered from the same folder will have the same number in the <folder> position.

In my case I have some folders with readily identifiable content. Those folders are also named in the same format, so by knowing where the content in those folders came from, I'll also have some information about where the randomly named files came from.

For example, I have a folder named 5_8173 and 116 files named 5_7366, 5_7379, 5_7408, etc. Since the folder 5_8173 contains some eBooks, the files starting with 5_ are also eBooks and I can try renaming them to .txt, .mobi, or .epub.

It's not a huge help, but it will save time identifying a lot of my files, given that I know the files once had only a small subset of the possible extensions. Of course I have some files for which there weren't any recovered subfolders, and I'll have to tinker with those.

Fortunately, most of my impacted data hasn't changed since my last backup, so I'm in pretty good shape.

Thanks to everyone who helped me out. I really appreciate it!
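The grouping described in this post can be automated with a few lines of Python. This is only a sketch based on the <folder>_<file> pattern observed above; the function name `group_by_folder_id` and the entries `12_44` and `notes.txt` in the usage example are hypothetical and not taken from the actual lost+found directory.

```python
from collections import defaultdict


def group_by_folder_id(names):
    """Group lost+found entry names of the form <folder>_<file> by <folder>.

    Names that do not match the two-number pattern are skipped.
    """
    groups = defaultdict(list)
    for name in names:
        folder_id, sep, file_id = name.partition("_")
        if sep and folder_id.isdigit() and file_id.isdigit():
            groups[folder_id].append(name)
    return dict(groups)


if __name__ == "__main__":
    # In practice the names would come from os.listdir() on the lost+found
    # directory; 12_44 and notes.txt are invented for this example.
    sample = ["5_8173", "5_7366", "5_7379", "5_7408", "12_44", "notes.txt"]
    for folder_id, members in group_by_folder_id(sample).items():
        print(folder_id, members)
```

Once one entry in a group has been identified (such as the eBook folder 5_8173), the rest of that group is a good candidate for the same small set of extensions.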