March 15, 2015

After many years of trundling along with my unRAID system with nary an issue, I've finally encountered a problem. The sequence of events has been as follows:

1) after a parity check, I noticed a number of errors and that one of the disks was showing as disabled;
2) the disk checked out fine on SMART reports, etc, but I couldn't get it to add back to the array;
3) the syslog showed a number of errors in the file system of one of the disks (13, unlucky for some :-);
4) somewhat optimistically, I elected to replace the disk and rebuild, which *seemed* to work successfully (i.e. the replacement drive was successfully added to the array and a parity check was successfully run).

However, this morning I noticed that one of the folders in one of my Shares is empty (there could be other instances of this, but this is the only one I've found so far). The data still seems to exist in the equivalent folder on the relevant data disks, but the Share is empty. Oddly, the files (which are movies) remain accessible via my XBMC library, but are not accessible via the NFS mounts in XBMC (as you would expect, as this mounts the Shares).

When I checked the webGUI, all drives were showing as valid and there had been a recent parity check with no errors, but the syslog showed file system errors on the replacement drive. I assume these were 'imported' onto this drive when it was rebuilt from the parity data.

I have run reiserfsck on the drive with the following output:

**************************
Replaying journal:
Trans replayed: mountid 125, transid 85183, desc 4518, len 1, commit 4520, next trans offset 4503
Trans replayed: mountid 125, transid 85184, desc 4521, len 1, commit 4523, next trans offset 4506
Replaying journal: Done.
Reiserfs journal '/dev/md13' in blocks [18..8211]: 2 transactions replayed
Checking internal tree..
\/ 3 (of 32|/ 3 (of 170\/ 17 (of 85|block 646938627: The level of the node (57756) is not correct, (1) expected
the problem in the internal node occured (646938627), whole subtree is skipped
/ 6 (of 32-/125 (of 168\block 446214386: The level of the node (5064) is not correct, (2) expected
the problem in the internal node occured (446214386), whole subtree is skipped
finished
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
2 found corruptions can be fixed only when running with --rebuild-tree
**************************

I assume I should run reiserfsck --rebuild-tree on the drive, but I'm concerned this will result in data loss on a drive that currently seems to be fully readable. I am currently endeavouring to copy the data from the drive to a backup location. Once this is done, I guess I should run the dreaded reiserfsck --rebuild-tree, but I thought I would consult the experts first.

If this works, will my Share miraculously repopulate, or will I need to take other steps?

Many thanks in advance.
March 15, 2015

If you had file system corruption on a disk, then rebuilding it does not correct the corruption; that can only be done using reiserfsck. A rebuild reproduces the disk's contents from parity, file system errors included. What it does fix is the 'disabled' status within unRAID that would have been caused by an earlier failed write.

Backing up what data you can before attempting recovery makes a lot of sense as a precaution, although it is rare to lose data using the --rebuild-tree option. In a worst case you might also want to use the --scan-whole-partition option, which may find additional files (which it would put into the lost+found folder). However, since this can also retrieve partial and deleted files, you may have a lot of work to sort out what is relevant.

Do you still have the old disk that was marked as disabled? If so, then there is a good chance that running reiserfsck against it can recover data.
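For reference, the check-then-repair sequence described in this reply might look roughly like the sketch below. It is only an illustration, not an official unRAID procedure: the device /dev/md13 is taken from the fsck output in the first post, while the `run` and `repair_plan` helpers and the `DRY_RUN` variable are made-up names. It assumes the file system is not mounted while it is being repaired, and it prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch only. Dry-run by default: commands are printed, not executed.
# DRY_RUN, run and repair_plan are illustrative names, not unRAID features.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repair_plan() {
    # Device name from the fsck output in the first post.
    device="${1:-/dev/md13}"

    # 1. Read-only check; its report says whether --rebuild-tree is needed.
    run reiserfsck --check "$device"

    # 2. Repair. Do this only after backing up whatever is still readable.
    run reiserfsck --rebuild-tree "$device"

    # 3. Last resort, left commented out: scan the whole partition, which
    #    can also bring back deleted and partial files into lost+found.
    # run reiserfsck --rebuild-tree --scan-whole-partition "$device"
}

repair_plan
```

Setting `DRY_RUN=0` would execute the commands for real. Pointing reiserfsck at the md device rather than the raw disk means the repair writes go through the array, so parity stays in step with the repaired disk.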
March 16, 2015 (Author)

Thanks for the response. It's really appreciated.

I do still have the old disk, but presumably the replacement is identical (right down to the file system errors)?

I will try to complete a full backup, but I do not currently have space to back up a full 3TB drive, so there may be an element of finger crossing for the remaining files.

Once I have run reiserfsck --rebuild-tree, will this fix the issue of having depopulated Shares, or would further steps be required to do this?

Thanks again.
March 16, 2015

"I do still have the old disk, but presumably the replacement is identical (right down to the file system errors)?"

Yes, they should be identical, but it is nice to know that in an emergency there is potentially another source for recovering data.

"Once I have run reiserfsck --rebuild-tree will this fix the issue of having depopulated Shares or would further steps be required to do this?"

Difficult to say. If the issue was caused by the file system corruption, then the files may just reappear. The other possibility is that they reappear in the lost+found folder because the file was recovered but the path leading to it was lost.

The final possibility for recovery is to run reiserfsck with the --scan-whole-partition option, which can recover deleted files (as long as they have not been overwritten) because it reads through every sector on the disk looking for files. However, that tends to be undesirable unless the data is absolutely critical, as it can also recover many file 'fragments' with only partial contents, and you are left trying to work out which recovered files are good and which are not. This can be very time consuming, so it is to be avoided unless absolutely necessary.
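As a rough illustration of the sorting-out step, the sketch below lists whatever ended up in lost+found, largest first, so the big movie files stand out from small fragments. The `list_recovered` helper is a made-up name, and the default path /mnt/disk13/lost+found is an assumption about where disk 13 is mounted; adjust it to the real location.

```shell
#!/bin/sh
# Sketch only. list_recovered is an illustrative helper, and the default
# path is an assumed mount point for disk 13.

list_recovered() {
    dir="${1:-/mnt/disk13/lost+found}"
    # Print "size-in-KiB<TAB>path" for every regular file, largest first.
    find "$dir" -type f -exec du -k {} + | sort -rn
}
```

Calling `list_recovered` with no argument uses the assumed default path; passing a directory overrides it. Where the `file` utility is installed, running it on an entry gives a guess at the content type, which can help put names back on recovered files.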
March 27, 2015 (Author)

Ran reiserfsck --rebuild-tree and it worked! Drive file system repaired. About 70 files in the lost+found but all (eventually) identifiable. The shares repopulated automatically when I brought the array back online.

Many thanks to itimpi for the help. I've marked this as solved.