snsumner, posted March 11, 2011:

Hello, I'm freaking out. The HDD I used as my parity drive was starting to fail, so I installed a new HDD to replace it. When I ran the parity sync, one of my drives (disk4) started generating a bunch of errors, so I shut down the system to figure out what was happening. From the syslog, there is something wrong with disk4. It passed all the smartctl tests, but according to the syslog it can't mount the REISERFS. I'm getting these messages:

    Mar 10 17:34:38 sumner-nas logger: mount: wrong fs type, bad option, bad superblock on /dev/md4,
    Mar 10 17:34:38 sumner-nas logger: missing codepage or helper program, or other error
    Mar 10 17:34:38 sumner-nas logger: In some cases useful info is found in syslog - try
    Mar 10 17:34:38 sumner-nas logger: dmesg | tail or so
    Mar 10 17:34:38 sumner-nas logger:
    Mar 10 17:34:38 sumner-nas emhttp: _shcmd: shcmd (26): exit status: 32
    Mar 10 17:34:38 sumner-nas emhttp: disk4 mount error: 32
    Mar 10 17:34:38 sumner-nas emhttp: shcmd (27): rmdir /mnt/disk4
    Mar 10 17:34:38 sumner-nas kernel: REISERFS warning (device md4): sh-2006 read_super_block: bread failed (dev md4, block 2, size 4096)
    Mar 10 17:34:38 sumner-nas kernel: REISERFS warning (device md4): sh-2006 read_super_block: bread failed (dev md4, block 16, size 4096)
    Mar 10 17:34:38 sumner-nas kernel: REISERFS warning (device md4): sh-2021 reiserfs_fill_super: can not find reiserfs on md4

Does anyone know what I should do in this situation, when I can't mount the REISERFS? I've attached my syslog from the previous reboot, which hit the same problem.

Attachment: syslog-2011-03-10.txt

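For anyone sifting through a long syslog for the same symptom, here is a minimal sketch of pulling out only the kernel's REISERFS warnings for one device. The function name fs_warnings is made up for this example, and the sample lines are copied from the post above; nothing here touches a disk.

```shell
#!/bin/sh
# fs_warnings DEVICE: read syslog text on stdin and print only the kernel's
# REISERFS warnings for DEVICE (hypothetical helper, not an unRAID command).
fs_warnings() {
    grep -F "REISERFS warning (device $1)"
}

# Demo on three lines copied from the post above; only the two kernel
# warnings for md4 are printed.
fs_warnings md4 <<'EOF'
Mar 10 17:34:38 sumner-nas emhttp: disk4 mount error: 32
Mar 10 17:34:38 sumner-nas kernel: REISERFS warning (device md4): sh-2006 read_super_block: bread failed (dev md4, block 2, size 4096)
Mar 10 17:34:38 sumner-nas kernel: REISERFS warning (device md4): sh-2021 reiserfs_fill_super: can not find reiserfs on md4
EOF
```

On a live system this would presumably be run as fs_warnings md4 < /var/log/syslog, assuming the syslog lives at that path.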
snsumner (author), posted March 11, 2011:

BTW, I followed the instructions for checking the file system from here: http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems

And this is the output I got:

    root@sumner-nas:~# umount /dev/md4
    umount: /dev/md4: not mounted
    root@sumner-nas:~# reiserfsck --check /dev/md4
    reiserfsck 3.6.21 (2009 www.namesys.com)

    *************************************************************
    ** If you are using the latest reiserfsprogs and it fails **
    ** please email bug reports to [email protected], **
    ** providing as much information as possible -- your **
    ** hardware, kernel, patches, settings, all reiserfsck **
    ** messages (including version), the reiserfsck logfile, **
    ** check the syslog file for any related information. **
    ** If you would like advice on using this program, support **
    ** is available for $25 at www.namesys.com/support.html. **
    *************************************************************

    Will read-only check consistency of the filesystem on /dev/md4
    Will put log info to 'stdout'

    Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

    The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight,the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to you to risk your time and data on it. If you don't want to follow that follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.

    bread: Cannot read the block (2): (Input/output error).

    Aborted

Joe L., posted March 11, 2011:

The disk is apparently not readable. Try power cycling it (removing power, then re-applying power, then trying reiserfsck once more).

Joe L.

snsumner (author), posted March 11, 2011:

Quoting Joe L.: "The disk is apparently not readable. Try power cycling it."

I tried power cycling the unRAID server and now it is hanging at the BIOS screen. The second HDD busy light is flashing endlessly; none of the other HDD lights are flashing. I took out each individual HDD (six total), put them in another server, and ran the DOS boot version of Seagate's SeaTools (short test). They all passed.

This all began when I noticed slow I/O performance on my parity drive: http://lime-technology.com/forum/index.php?topic=11548.0

I ordered a replacement drive and replaced the parity drive so I could ship the old one back to Seagate for warranty repair. I run daily SMART reports on all my HDDs, and all the others appeared to be healthy. The parity drive had a high number of bad sectors, which is why I was being proactive and swapping it out.

When I started the parity sync I immediately got tons of errors on disk4. About 7% into the parity sync I stopped the array and rebooted the system. On the reboot /dev/md4 couldn't be mounted, and now my configuration is invalid.

This might be some type of hardware issue with my motherboard or controllers. I might have to swap out the motherboard and controller to resolve this problem. Either way, I'm very concerned about losing data, since I have many important files on my unRAID system.

I will keep trying to get the system to boot; otherwise I will replace the motherboard and maybe the controllers to see if that resolves my problem.

Any suggestions are greatly appreciated.

Joe L., posted March 11, 2011:

Quoting snsumner: "I tried power cycling the unraid server and now it hanging at the BIOS screen."

Are you sure the flash drive is still selected as the boot device?

lionelhutz, posted March 11, 2011:

My first advice is to not expect unRAID to be a backup for your files. Set up another off-site backup system. Try CrashPlan, backing up to another computer or to online storage.

Peter

dgaschk, posted March 11, 2011:

I use unRAID as a backup server. If I lose my backups, they are very easy to replace. unRAID is highly reliable when built and operated properly, but no storage system should be relied upon to protect the single copy of important data.

snsumner (author), posted March 11, 2011:

Quoting Joe L.: "are you sure the flash drive is still selected as the boot device?"

Yes. I got the system to reboot fine; I'm not sure what was causing the problem.

The bad news is that disk4 (/dev/md4) still refuses to mount. As I stated before, I ran the SeaTools short test on it and it passed. I also made sure all my cables and controllers are tightly connected. Disk4 is detected by the unRAID computer; it just can't mount the volume!

I'm going to run the long test on disk4 overnight and see if it reports any problems.

I attached a zip file which contains:

- screenshots of the problem from the unRAID interface
- screenshots of my device options before and after I swapped my parity drive
- output of the SMART test I ran on the drive
- output of my attempt to run reiserfsck, which failed
- my syslog from my last reboot

I ran fdisk /dev/sdc -l and got the following output:

    root@sumner-nas:~# fdisk /dev/sdc -l

    Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
    1 heads, 63 sectors/track, 46512336 cylinders
    Units = cylinders of 63 * 512 = 32256 bytes
    Disk identifier: 0x00000000

    Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1            2    46512336  1465138552+  83  Linux
    Partition 1 does not end on cylinder boundary.
    root@sumner-nas:~#

All my data appears to be intact on disk1, disk2, disk3, and disk5, but I cannot mount disk4.

Is there some recovery utility I can run to get whatever I can from this disk? The drive appears to be in working condition, but something corrupted the ReiserFS. Is there a way to copy what is not corrupted to another disk? Is there anything I can do?

Any help is greatly appreciated.

Attachment: diagnostic_files.zip

snsumner (author), posted March 11, 2011:

So I found this thread where somebody had a very similar problem: http://lime-technology.com/forum/index.php?topic=4021.msg35622;topicseen#msg35622

They ran --rebuild-tree and it fixed their problem. I think I'm at the point where I need to do the same thing, but I want the powers that be to tell me whether I've exhausted all my other options.

Please let me know if I should try running --rebuild-tree.

Thanks,
Scott

lionelhutz, posted March 11, 2011:

Yes, give it a try.

Peter

snsumner (author), posted March 11, 2011:

I tried running --rebuild-tree and got the same error:

    root@sumner-nas:~# reiserfsck --rebuild-tree /dev/md4
    reiserfsck 3.6.21 (2009 www.namesys.com)

    *************************************************************
    ** Do not run the program with --rebuild-tree unless **
    ** something is broken and MAKE A BACKUP before using it. **
    ** If you have bad sectors on a drive it is usually a bad **
    ** idea to continue using it. Then you probably should get **
    ** a working hard drive, copy the file system from the bad **
    ** drive to the good one -- dd_rescue is a good tool for **
    ** that -- and only then run this program. **
    ** If you are using the latest reiserfsprogs and it fails **
    ** please email bug reports to [email protected], **
    ** providing as much information as possible -- your **
    ** hardware, kernel, patches, settings, all reiserfsck **
    ** messages (including version), the reiserfsck logfile, **
    ** check the syslog file for any related information. **
    ** If you would like advice on using this program, support **
    ** is available for $25 at www.namesys.com/support.html. **
    *************************************************************

    Will rebuild the filesystem (/dev/md4) tree
    Will put log info to 'stdout'

    Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

    The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight,the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to you to risk your time and data on it. If you don't want to follow that follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.

    bread: Cannot read the block (2): (Input/output error).

    Aborted

snsumner (author), posted March 11, 2011:

I tried running --rebuild-sb and also got the same error:

    root@sumner-nas:~# reiserfsck --rebuild-sb /dev/md4
    reiserfsck 3.6.21 (2009 www.namesys.com)

    *************************************************************
    ** If you are using the latest reiserfsprogs and it fails **
    ** please email bug reports to [email protected], **
    ** providing as much information as possible -- your **
    ** hardware, kernel, patches, settings, all reiserfsck **
    ** messages (including version), the reiserfsck logfile, **
    ** check the syslog file for any related information. **
    ** If you would like advice on using this program, support **
    ** is available for $25 at www.namesys.com/support.html. **
    *************************************************************

    Will check superblock and rebuild it if needed
    Will put log info to 'stdout'

    Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

    The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight,the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to you to risk your time and data on it. If you don't want to follow that follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.

    bread: Cannot read the block (2): (Input/output error).

    Aborted

lionelhutz, posted March 11, 2011:

You're missing the parity right now, correct?

Can you get another drive to make a copy onto? Then you can try rebuilding that drive.

Your partition table looks correct. Here is my 1.5T drive:

    root@MediaServer:~# fdisk /dev/sde -l

    Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
    1 heads, 63 sectors/track, 46512336 cylinders
    Units = cylinders of 63 * 512 = 32256 bytes
    Disk identifier: 0x00000000

    Device Boot      Start         End      Blocks   Id  System
    /dev/sde1            2    46512336  1465138552+  83  Linux
    Partition 1 does not end on cylinder boundary.

Peter

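As a rough cross-check of why the two fdisk listings can be called consistent, the arithmetic below recomputes the partition size from the cylinder numbers shown. The helper name part_sectors is made up for this sketch; the numbers are copied from the fdisk output, and the reading of the trailing "+" as "odd number of 512-byte sectors" is how older fdisk versions document it.

```shell
#!/bin/sh
# part_sectors START_CYL END_CYL SECTORS_PER_CYL: number of 512-byte sectors
# in a partition spanning the given cylinders, inclusive (hypothetical helper).
part_sectors() {
    echo $(( ($2 - $1 + 1) * $3 ))
}

# Values copied from the fdisk output: 1 head * 63 sectors/track = 63
# sectors per cylinder, partition from cylinder 2 to cylinder 46512336.
sectors=$(part_sectors 2 46512336 63)

echo "sectors:      $sectors"                 # 2930277105
echo "1K blocks:    $(( sectors / 2 ))"       # 1465138552, matches "Blocks"
echo "odd sector:   $(( sectors % 2 ))"       # 1, which is what the "+" marks
echo "start sector: $(( (2 - 1) * 63 ))"      # 63
```

Both drives report the same start, end, and block count, so the partition layout itself does not look damaged.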
snsumner (author), posted March 11, 2011:

I'm currently running badblocks in non-destructive mode:

    badblocks -nvs /dev/sdb

lionelhutz, posted March 11, 2011:

It seems that will test for bad sectors. SeaTools should have pointed out bad sectors, and if you just pull the SMART data from the disk it will indicate whether there are bad sectors (they will be listed as pending reallocation or reallocated).

There is definitely something strange with that partition, since reiserfsck seems to be having trouble with the superblock. But then it says it can't read the block, so I guess that could mean any 4k block on the disk.

Here's something to consider. It appears the first 4k block in the partition is the superblock. In the worst case, I would try writing 00's to the superblock and then repairing the partition again. This is where the extra copy disk would come in handy to experiment on.

This page explains the reiserfs layout and where the superblock is, assuming that unRAID uses 4k blocks: http://homes.cerias.purdue.edu/~florian/reiser/reiserfs.php

Peter

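A small aside on block numbers, as a hedged sketch rather than a statement about this particular disk: the kernel messages in the first post show failed reads at block 2 and block 16 with a 4096-byte block size, which correspond to byte offsets 8 KiB and 64 KiB into the partition. Those are the offsets where the old and the current ReiserFS on-disk formats keep the superblock, so that may be a better place to look than block 0. The helper name sb_offset is made up, and the dd line is left commented out because it reads a real device.

```shell
#!/bin/sh
# sb_offset BLOCK BLOCK_SIZE: byte offset of a block within the partition
# (hypothetical helper for illustration only).
sb_offset() {
    echo $(( $1 * $2 ))
}

echo "block 2  -> byte offset $(sb_offset 2 4096)"    # 8192  (8 KiB)
echo "block 16 -> byte offset $(sb_offset 16 4096)"   # 65536 (64 KiB)

# Read-only peek at block 16, assuming the device is /dev/md4 as in the
# thread. On a healthy ReiserFS 3.6 volume the dump should contain a magic
# string such as "ReIsEr2Fs".
# dd if=/dev/md4 bs=4096 skip=16 count=1 2>/dev/null | hexdump -C | head
```

If that read fails with an I/O error, it points at the drive, cable, or controller rather than at the filesystem contents.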
dgaschk, posted March 11, 2011:

Can you put the original parity drive back and rebuild disk4? This should be a last resort if you can't revive disk4.

snsumner (author), posted March 11, 2011:

Quoting lionelhutz: "In the worst case, I would try writing 00's to the superblock and then repairing the partition again. This is where the extra copy disk would come in handy to experiment on."

I have an extra disk. How would I go about doing this? Is there any documentation explaining how I could do this?

snsumner (author), posted March 11, 2011:

Quoting dgaschk: "Can you replace the original parity drive and rebuild disk4? This should be a last resort if you can't revive disk4."

My original parity drive is not usable, so I need to figure out how to recover the data on disk4. I do have an extra drive to copy data to; I just don't know the steps I need to follow.

snsumner (author), posted March 11, 2011:

So overnight I ran badblocks and the long SMART test, and both completed with no errors!

To troubleshoot further, I moved the disk to another controller. I then went into unRAID and assigned the drive back to disk4.

Now unRAID is telling me it's a new disk and giving me this option:

    Start will bring the array on-line, start Data-Rebuild, and then expand the file system (if possible).

I definitely know that my parity disk is not good, so if I start Data-Rebuild I'm guessing I will lose all the data on this disk for sure.

Not sure what to do from here.

Joe L., posted March 11, 2011:

Quoting snsumner: "I move the disk associated with /dev/md4 to another SATA controller and assigned it as disk4 in UnRAID. Now unraid is telling me its a new disk and giving me the option: Start will bring the array on-line, start Data-Rebuild, and then expand the file system (if possible). Now sure what to do now."

If you start the array, unRAID will attempt to write the reconstructed contents of disk4 back to disk4.

If you do not have good parity, it will write garbage to the drive.

My suggestion for evaluating what it will write is to un-assign disk4 for the moment and start the array without it. The disk will be simulated. You can then look around on it to see the contents. If it looks as expected, then it is OK to let those contents be reconstructed onto disk4.

You can then just stop the array, assign disk4, and start the array, letting it reconstruct the disk.

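To make "simulated" a little more concrete, here is a toy sketch of single-parity reconstruction, on the assumption that parity is the bitwise XOR of the same position on every data disk. One byte stands in for each disk; the values and the helper name xor_all are invented for the example.

```shell
#!/bin/sh
# xor_all N...: XOR all arguments together (hypothetical helper).
xor_all() {
    acc=0
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# One made-up byte per data disk, disk1..disk5.
d1=165 d2=60 d3=15 d4=240 d5=51

# A valid parity byte is the XOR of all data disks.
parity=$(xor_all $d1 $d2 $d3 $d4 $d5)
echo "parity:             $parity"    # 85

# With disk4 unassigned, its byte is simulated from parity plus the others.
rebuilt=$(xor_all $parity $d1 $d2 $d3 $d5)
echo "disk4 from parity:  $rebuilt"   # 240, same as d4

# If parity was never fully synced (modelled here as a stale 0), the same
# calculation returns garbage instead of the real disk4 contents.
stale=$(xor_all 0 $d1 $d2 $d3 $d5)
echo "disk4 from stale:   $stale"     # 165, not 240
```

That last case is the risk being described: a rebuild from a parity disk that was only a few percent synced would write the wrong data onto disk4.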
snsumner (author), posted March 11, 2011:

Quoting Joe L.: "If you do not have good parity it will write garbage to the drive. My suggestion to evaluate what it will write is to un-assign disk4 for the moment, start the array without it. It will be simulated."

I'm not sure what you mean by "it will be simulated" (since the parity is bad, how could it simulate disk4?).

I have no problems starting up the array if I remove disk4.

The parity drive is a brand new disk. I ran the parity sync and about 6% into it I noticed a huge number of errors on disk4. So I stopped the parity sync and shut down unRAID, and then disk4 would not mount again.

After moving disk4 to another controller, I was able to run --rebuild-sb successfully. I then ran --check, which told me to run --rebuild-tree, which I'm running now.

Assuming most of the files are intact after --rebuild-tree completes, how do I merge disk4 back into the array? Or is this not the correct process to follow?

Most of the data on disk4 is not super critical, mostly laptop image backups and movies. These are things I can recover, but I'd rather not have to go back and get those files again.

lionelhutz, posted March 11, 2011:

Good, it sounds like it is rebuilding.

Once done, add all the drives to the array in their proper slots. If unRAID does not say it will build parity, then use the initconfig command and answer Yes. Then refresh the browser, and I believe it should appear with a blue parity disk and green data disks.

FYI, the command

    dd if=/dev/sdX of=/dev/sdY bs=1M conv=noerror

will make an exact clone of the disk. sdX is the source and sdY is the destination.

Peter

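A cautious variation on the clone command above, offered as a sketch and not as the recommended procedure: the wrapper below only prints the dd command so it can be reviewed before anything is written, and it refuses to run when source and destination are the same device. The function name clone_cmd is made up. The added sync option is my assumption about what is wanted here; with GNU dd it pads a block that could not be read with zeros, so data after a bad spot stays at the same offset on the copy.

```shell
#!/bin/sh
# clone_cmd SRC DST: print (not run) a dd command that clones SRC onto DST.
# Hypothetical helper; sdX and sdY are placeholders exactly as in the post.
clone_cmd() {
    src=$1
    dst=$2
    if [ "$src" = "$dst" ]; then
        echo "refusing: source and destination are the same device" >&2
        return 1
    fi
    # noerror: keep going after read errors.
    # sync: pad short or failed reads with zeros so offsets stay aligned.
    echo "dd if=$src of=$dst bs=1M conv=noerror,sync"
}

# Dry run: shows the command for review.
clone_cmd /dev/sdX /dev/sdY
```

Swapping if= and of= overwrites the disk being rescued, which is why the sketch stops at printing. The reiserfsck banner quoted earlier in the thread also mentions dd_rescue as a tool for this job.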
snsumner (author), posted March 11, 2011:

Quoting lionelhutz: "Good, it sounds like it is rebuilding. Once done, add all the drives to the array in their proper slots."

Great news to report: I was able to mount the volume and no data was lost!!! Just to be safe, I'm going to make a copy of the volume to another disk.

So my next question is, what steps do I need to follow to get disk4 incorporated back into the array? Remember, my parity disk is not valid, so I need to figure out how to add disk4 back into the array and then run a parity sync.

Anyone know what steps I should follow?

lionelhutz, posted March 11, 2011:

As I said above: once done, add all the drives to the array in their proper slots. If unRAID does not say it will build parity, then use the initconfig command and answer Yes. Then refresh the browser, and I believe it should appear with a blue parity disk and green data disks.

snsumner (author), posted March 12, 2011:

Quoting lionelhutz: "If unRAID does not say it will build parity then use the initconfig command and answer Yes. Then, refresh the browser and I believe it should appear with a blue parity and green data disks."

OK, I successfully recovered disk4 and was able to mount it and back up all files to another disk. I ran initconfig and answered Yes. It renamed super.dat to super.bak and created a new config.

Now all my disks are showing up blue! Shouldn't disk1-disk5 be green and parity be blue? What should I do now? I'm afraid of losing data by starting the parity sync.

I've attached a screenshot of what I'm seeing.

I have all my data sitting on disk1-disk5. I can mount them all successfully, and they all passed reiserfsck --check.
