Sparkum Posted March 11, 2017
Hey all. So a while back I had a disk failure. I put a new disk in and the parity rebuild started, but it was moving really slowly, and by morning the new disk had died as well (the second parity disk never kicked in; I simply lost the data). So now I am attempting to recover the data. Does anyone think this would be possible? I was referred here: https://ubuntuforums.org/showthread.php?t=1245536&p=7822694#post7822694 But honestly, I'm no less confused. Could anyone add some insight? Thanks!
Squid Posted March 11, 2017
Not going to comment on the recovery process, but I will state that actual disk failures are pretty rare, and based upon your description it's more likely to be a cabling or power issue. Without diagnostics, though, it's impossible to tell...
JorgeB Posted March 11, 2017
Also the new disk failing during rebuild wouldn't make you lose data, you'd just need to rebuild it again or copy the data from the emulated disk.
Sparkum Posted March 12, 2017 (Author)
57 minutes ago, Squid said: Not going to comment on the recovery process, but I will state that actual disk failures are pretty rare, and based upon your description it's more likely to be a cabling or power issue. Without diagnostics, though, it's impossible to tell...
These were my results when trying to mount it with the Unassigned Devices plugin afterwards:
Feb 26 13:47:28 Mount of '/dev/sdq1' failed. Error message: mount: wrong fs type, bad option, bad superblock on /dev/sdq1,
Sparkum Posted March 12, 2017 (Author)
33 minutes ago, johnnie.black said: Also the new disk failing during rebuild wouldn't make you lose data, you'd just need to rebuild it again or copy the data from the emulated disk.
Sorry, to be more specific: after the second disk died, it only emulated the ~180GB that the parity rebuild had recovered.
JorgeB Posted March 12, 2017 (edited)
9 hours ago, Sparkum said: Sorry, to be more specific: after the second disk died, it only emulated the ~180GB that the parity rebuild had recovered.
This doesn't make sense to me: if the rebuild failed, removing/unassigning that disk and starting the array would emulate all data on the missing disk; if it doesn't, you have other problems. Also, like Squid already said, if you want help, post your diagnostics.
Edited March 12, 2017 by johnnie.black
trurl Posted March 12, 2017
12 hours ago, Sparkum said: (second parity disk never kicked in I simply lost the data)
This doesn't make sense either. And as mentioned, you should be able to read the disk even with it removed, and even if you only have single parity. The link you gave seems to be about recovering from filesystem corruption. If that is your problem, rebuilding the disk is not going to help no matter how many disks you try. You need to fix the filesystem. Post a diagnostic.
Sparkum Posted March 12, 2017 (Author)
8 hours ago, johnnie.black said: This doesn't make sense to me: if the rebuild failed, removing/unassigning that disk and starting the array would emulate all data on the missing disk; if it doesn't, you have other problems. Also, like Squid already said, if you want help, post your diagnostics.
I can't imagine any logs I can provide are going to be of any use. This happened near the beginning of February, and was noticed probably about 5 days later when people started asking me where a bunch of stuff went: media missing, files from my wife's shares gone, etc. I started diagnosing near the end of February, kinda failed, and am giving it another shot now. Additionally, I turned my desktop into an Unraid machine last night, so new cables, no RAID card, motherboard connections only, etc., and installed the Unassigned Devices plugin; same results with both disks. My array is fully rebuilt and has been for a month; it's just a matter of recovery now.
trurl Posted March 12, 2017
1 hour ago, Sparkum said: This happened near the beginning of February, and was noticed probably about 5 days later when people started asking me where a bunch of stuff went.
Sounds like you really, really needed Notifications set up, but you hadn't done it. Do you have Notifications set up now?
Sparkum Posted March 12, 2017 (Author)
28 minutes ago, trurl said: Sounds like you really, really needed Notifications set up, but you hadn't done it. Do you have Notifications set up now?
No, sorry, I knew every second of the way; I just didn't know for a couple of days that there was lost data:
Drive kicked out - notified.
New drive in and rebuild started.
Woke up to notification that the new drive was kicked out.
Put the next new drive in, rebuilt successfully.
Moved on with my life.
~5 days go by and people start asking where stuff is.
A month+ later, now trying to figure out what was on those drives (and plan a backup computer).
trurl Posted March 12, 2017
So is that rebuilt disk still in the computer? If so, have you checked it for filesystem corruption? A rebuild won't fix filesystem corruption, and filesystem corruption won't prevent a successful rebuild. So it seems to me you may still have issues you don't know about.
Sparkum Posted March 12, 2017 (Author)
3 minutes ago, trurl said: So is that rebuilt disk still in the computer? If so, have you checked it for filesystem corruption? A rebuild won't fix filesystem corruption, and filesystem corruption won't prevent a successful rebuild. So it seems to me you may still have issues you don't know about.
The disks in question are removed; the parity is rebuilt with replacement drives and working perfectly. The disks have been placed back into my computer AND another computer and added with the Unassigned Devices plugin, but I have been unable to mount them. As I type this though, with what you said, there is something additional I can do when I mount in maintenance mode, isn't there? A way to skip corruption or something like that?
trurl Posted March 12, 2017
https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems
Sparkum Posted March 13, 2017 (Author)
7 hours ago, trurl said: https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems
Thanks for this. So initially I ran

xfs_repair -nv /dev/sdm

to which I was told:

Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
............................................................................................................................................................................................................................................................................................................................................................................................................................
Sorry, could not find valid secondary superblock
Exiting now.

followed by:

reiserfsck --rebuild-sb /dev/sdm

I went through all the questions the first time (didn't copy and paste that), but the following times that I run it I am greeted with:

reiserfsck 3.6.24
Will check superblock and rebuild it if needed
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
Reiserfs super block in block 16 on 0x8c0 of format 3.6 with standard journal
Count of blocks on the device: 488378640
Number of bitmaps: 14905
Blocksize: 4096
Free blocks (count of blocks - used [journal, bitmaps, data, reserved] blocks): 0
Root block: 0
Filesystem is NOT clean
Tree height: 0
Hash function used to sort names: not set
Objectid map size 0, max 972
Journal parameters:
Device [0x0]
Magic [0x0]
Size 8193 blocks (including 1 for journal header) (first block 18)
Max transaction length 1024 blocks
Max batch size 900 blocks
Max commit age 30
Blocks reserved by journal: 0
Fs state field: 0x1: some corruptions exist.
sb_version: 2
inode generation number: 0
UUID: c462cb69-dfa8-4482-8861-bf2588d00976
LABEL:
Set flags in SB:
Mount count: 1
Maximum mount count: 30
Last fsck run: Sun Mar 12 22:33:51 2017
Check interval in days: 180
Super block seems to be correct

At this point I am kinda assuming I'm SOL?
Sparkum Posted March 13, 2017 (Author)
Also just went through the same process with the second disk and received:

root@Tower:/mnt/disks# reiserfsck --fix-fixable /dev/sdm
reiserfsck 3.6.24
Will check consistency of the filesystem on /dev/sdm and will fix what can be fixed without --rebuild-tree
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
########### reiserfsck --fix-fixable started at Sun Mar 12 22:58:43 2017 ###########
Replaying journal: The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight, the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to risk your time and data on it. If you don't want to follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.

So that disk is pretty black and white.
JorgeB Posted March 13, 2017
2 hours ago, Sparkum said: xfs_repair -nv /dev/sdm
You need to specify the partition, e.g.: xfs_repair -nv /dev/sdm1
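To make the point above concrete, here is a minimal sketch of deriving the partition node from the whole-disk node (device names follow this thread's examples):

```shell
# The repair tools must be pointed at the partition (sdm1), not the
# whole-disk device (sdm); on the whole disk xfs_repair finds no
# superblock at all.
DISK=/dev/sdm
PART="${DISK}1"              # first partition, e.g. /dev/sdm1
echo "xfs_repair -nv $PART"  # -n: read-only (no modify), -v: verbose
```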
Sparkum Posted March 13, 2017 (Author)
6 hours ago, johnnie.black said: You need to specify the partition, e.g.: xfs_repair -nv /dev/sdm1
Gah =/ Thanks! I'll try again tonight!
RobJ Posted March 13, 2017 (edited)
Sparkum, it's really important you use exactly the correct command, and it's really important you know exactly which file system it is you're fixing, because if you use the wrong one, you can damage things even worse. All 3 commands you tried above were wrong! This is not the place for trial-and-error methods. Please take the time to read the wiki page more thoroughly!
Edited March 13, 2017 by RobJ
JorgeB Posted March 13, 2017 (edited)
1 hour ago, RobJ said: All 3 commands you tried above were wrong!
True that, I didn't even look carefully at the other commands used, only at the first. Ideally you know which filesystem it is, but if not, run both tools in read-only mode: the xfs_repair command was correct, but with ReiserFS use reiserfsck --check only at first.
Edited March 13, 2017 by johnnie.black
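The safe order of operations described here can be sketched as a small helper. `pick_tool` is hypothetical, not part of any Unraid tooling; in practice you would feed it the TYPE reported by `blkid -o value -s TYPE /dev/sdm1` on the partition.

```shell
# Hedged sketch: map a detected filesystem type to its *read-only*
# check command, so the wrong (or destructive) tool is never the
# first thing run. pick_tool is a hypothetical helper.
pick_tool() {
  case "$1" in
    xfs)      echo "xfs_repair -n" ;;       # no-modify check
    reiserfs) echo "reiserfsck --check" ;;  # read-only check
    *)        echo "unknown: identify the filesystem first" ;;
  esac
}

pick_tool xfs       # prints: xfs_repair -n
pick_tool reiserfs  # prints: reiserfsck --check
```

Only after the read-only pass confirms both the filesystem type and the nature of the damage would you move on to a repair flag.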
Sparkum Posted March 13, 2017 (Author)
Alright, thanks guys. Sorry, yeah, I definitely skimmed the page; I'll read it in full before trying again tonight. And thanks again for mentioning the partition thing; I guess I assumed (my bad for not reading) that I was supposed to run it on the disk, not the partition. Hopefully I didn't make my problem worse, but at least I can say it's because of me and not Unraid's product.
Sparkum Posted March 14, 2017 (Author)
Well, I went at it again tonight. The first drive to fail is completely dead; I keep getting the message that there is a hardware problem. Alright, the drive died as expected; it was old. The second drive I was able to get going once I started using sdm1, and got it mounted, however it's only showing 1GB of data. It is what it is: 99% of what was on that disk wasn't something to worry about, the .5% that truly mattered I have in multiple places, and the remaining .5% just kinda sucks. I honestly don't know if I can 100% say anything bad about Unraid; all I know is that days after it all happened, files were not there, from multiple shares, so that was my logical assumption. I know everyone said that's impossible, but in my mind it all makes sense. This is not my first drive failure (or kick-out, rather), and every time Unraid has worked perfectly. Thanks all for your help! I at least got the answers I wanted.
RobJ Posted March 14, 2017
Are you absolutely certain that both drives were formatted with XFS? I noticed that you used both xfs_repair and reiserfsck, as if you were not sure which file system was in use. Using the wrong tool can both cause additional damage and produce errors that may look like hardware errors. You need to find previous evidence of the actual file systems, like older syslogs or notes or screen captures, that indicate the correct file system for each. Once you know which one it is (ReiserFS or XFS), you can retry on both with the correct tool. And if it is ReiserFS, you should try the --rebuild-tree option with the scan-whole-partition option (-S), which searches the entire partition for files and folders.
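As a sketch of the sequence described above (device name as in the earlier posts), this block only prints the commands in order; --rebuild-tree rewrites the filesystem tree, so it is a last resort, and worth running against a dd image of the partition if you can spare the space:

```shell
# Sketch: the ReiserFS recovery sequence, printed in order.
# /dev/sdm1 is the partition from earlier posts; run the printed
# commands only after confirming the filesystem really is ReiserFS.
PART=/dev/sdm1
echo "reiserfsck --check $PART"            # 1. read-only: confirm the fs state
echo "reiserfsck --rebuild-tree -S $PART"  # 2. destructive: -S scans the whole partition
```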