(Solved by itimpi) - Failed data disk



Hi all,

I hope someone can help me solve a problem.

1. Last week I tried to transfer files from a USB stick to my unRAID server, and in the middle of copying the files unRAID just stopped. When I logged in to unRAID, I saw that disk #4 was "unformatted". I thought drive #4 was bad, so I replaced it (2TB) with a larger 3TB drive.

2. I precleared the 3TB drive with the command ./pre_clear /dev/sdo, which finished without errors.

3. Brought the array online.

4. Tried to format the 3TB drive; it seems it cannot be formatted.

5. Rebuilt data disk #4, but drive #4 still shows as unformatted in the firmware (6.12 beta).

6. Took the array offline, but it seemed not all drives could be taken to the offline state. The last option was to reboot from the console.

7. Upgraded the firmware to 6 RC3.

8. Brought the array online.

9. Tried to format the 3TB drive again; it still cannot be formatted.

10. Cannot bring the array offline; both firmware versions have the same problem.

What can I do to solve this problem? Please help me out, or give me instructions on what I have to do next.

Thanks for the help.

[Attachment: data_failed_disk.jpg]

[Attachment: preclear_finish_W1F36MD3_2015-05-21.txt]


Note that a write failure can leave some file system corruption behind, and until that is resolved the disk will continue to show up as unmountable. Rebuilding onto another disk does not resolve such corruption.

Are the disks physically accessible now without throwing errors? If so, the way forward is to put the array into Maintenance mode and then, from a telnet/console session, run a command of the form:

reiserfsck --check /dev/md4

and see what it reports (it takes some time to run). That will at least confirm whether corruption is in fact your issue, and suggest how to resolve it if it is.
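For reference, a minimal console sketch of that check. It assumes the failing disk is in slot 4 (hence /dev/md4) and that /boot is the usual unRAID flash mount; the log file name is only an example, so adjust both to your system:

# Start the array in Maintenance mode from the web GUI first, so the
# file system is not mounted while the check runs.

# Read-only consistency check of the slot-4 device; answer "Yes" at the
# prompt. tee keeps a copy of the output on the flash drive for posting.
reiserfsck --check /dev/md4 2>&1 | tee /boot/reiserfsck-disk4-check.log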


Thanks itimpi. It seems all disks are accessible OK (I am just guessing, because there are more than 14 disks and all shares use high-water allocation, so it is hard to tell). Let me mount the disks and run the command line as you suggested.

 


Do I need to put the old 2TB disk back in slot #4, or do I still use the new disk for the check?


Please help me. I have just finished running the command

reiserfsck --check /dev/md4

and I see this message:

root@hightower:~# reiserfsck --check /dev/md4
reiserfsck 3.6.24

Will read-only check consistency of the filesystem on /dev/md4
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Sat May 23 06:43:17 2015
###########
Replaying journal: Done.
Reiserfs journal '/dev/md4' in blocks [18..8211]: 0 transactions replayed
Checking internal tree..  / 13 (of  15)/ 66 (of 144)/135 (of 160)block 224912409: The level of the node (0) is not correct, (1) expected
the problem in the internal node occuredfinished
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
1 found corruptions can be fixed only when running with --rebuild-tree
###########
reiserfsck finished at Sat May 23 06:53:28 2015
###########

What do I have to do next?

 


Well, that confirms that corruption is present.

You need to rerun reiserfsck using the --rebuild-tree option instead of the --check option. With any luck this should recover everything, although there is a faint chance that a file being written at the time of the failure may not be recovered properly.

It would probably have been possible to recover the original 2TB disk the same way. However, since you have now put the larger disk in, you should recover that one, as unRAID provides no way to replace a larger disk with a smaller one. Keep the 2TB disk as a backup for emergency data recovery until you are happy with the results of running reiserfsck against the current disk.
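For reference, a sketch of that repair step under the same assumptions as before (disk in slot 4, array started in Maintenance mode, console session; the log path is only an example):

# Destructive repair: rebuilds the file system tree on /dev/md4 in place,
# so keep the old 2TB disk set aside until you have verified the result.
reiserfsck --rebuild-tree /dev/md4 2>&1 | tee /boot/reiserfsck-disk4-rebuild.log

# Afterwards, stop Maintenance mode and start the array normally; disk 4
# should then mount again. Any files reiserfsck could not re-attach end
# up in a lost+found directory at the top of that disk.

Writing the output to a log is optional, but it makes it easy to post the results back here if anything looks wrong.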


Thanks itimpi. Without your help, I could not have solved my problem.

