philouza Posted August 18, 2017

Running version 5.0.5. Got a red ball on one of my disks (PL1331LAGRTP6H) a week or so ago. The SMART report and test came back OK, so I figured it was the SATA controller or the cable. I moved the drive to a different controller with a different cable and went through the unassign-then-reassign procedure to start rebuilding it. A few hours into the rebuild another disk (PL1331LAGSAUDH) red-balled, which seems to have halted the rebuild. Unfortunately the SMART report shows that drive is failing. I can still access the drive, but the data looks corrupt and I can only see ~12 GB of the 3.7 TB via Samba.

So as of now, my array has PL1331LAGRTP6H with an orange ball (it was in the middle of rebuilding) and PL1331LAGSAUDH with a red ball that seems screwed. I have a new 6 TB drive handy. I've attached SMART reports for both drives and a syslog from the last reboot. Sorry, I should have saved the syslog when the second drive fell over. The array is currently stopped and showing 'configuration valid'. Hope someone can help. Cheers.

syslog PL1331LAGRTP6H.txt PL1331LAGSAUDH.txt
JorgeB Posted August 19, 2017

7 hours ago, philouza said: I can still access the drive, however the data looks corrupt and I can only see ~12gb of the 3.7tb via samba.

That is normal: you have 2 invalid disks with single parity, so unRAID can't correctly emulate the missing data.

Assuming disk12's data is unchanged since it first became disabled, you can do the following. Disk12 may still have some corruption, though very little, because its rebuild stopped partway through:

-Utils -> New Config
-re-assign all disks, double check parity is in the parity slot
-check "parity is already valid" before starting the array
-start the array

Now check if data on disks 11 and 12 looks OK.
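For anyone wondering why two invalid disks defeat single parity, here is a toy sketch. It assumes plain byte-wise XOR parity across the data disks; the block contents and the function names (`xor_blocks`, `rebuild_missing`) are made up for illustration and this is not unRAID code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """XOR the parity block with every surviving data block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical 4-byte "disks", purely for illustration.
disk10 = b"\x11\x22\x33\x44"
disk11 = b"\xa0\xb1\xc2\xd3"
disk12 = b"\x0f\x1e\x2d\x3c"
parity = xor_blocks([disk10, disk11, disk12])

# One disk missing: parity plus the survivors gives it back exactly.
rebuilt_disk12 = rebuild_missing([disk10, disk11], parity)

# Two disks missing: all that can be recovered is disk11 XOR disk12,
# which is neither disk's real content.
disk11_xor_disk12 = rebuild_missing([disk10], parity)
```

With one disk gone the result matches the original block. With two gone you only get the XOR of the two missing blocks, which is why the emulated disk showed garbage.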
philouza Posted August 20, 2017

On 19/08/2017 at 6:30 PM, johnnie.black said: Now check if data on disks 11 and 12 looks OK

Thanks so much for the reply. Did as you instructed and the array came back up. Kudos. I started a parity check and disk 11 eventually red-balled again, killing the check. Tons of write errors, so I'm confident the drive is shot. See syslog_disk11_fail attached. I replaced it with a new 6 TB drive and it is currently rebuilding, which seems to be going OK except for these errors in the current syslog:

Aug 20 19:22:00 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 24537 does not match to the expected one 1
Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 345831560. Fsck?
Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [587 1500 0x0 SD]

Attached that syslog below too. Assuming I need to wait until the rebuild is finished, would I then run reiserfsck --check against disk 11, or should I kick off a parity check first and go from there?

syslog_disk11_fail syslog
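If the syslog is long, a small script can tally the ReiserFS errors per device. This is only a sketch: the regular expression is written against the three lines quoted above and is an assumption about the message layout, not a documented format, and `count_reiserfs_errors` is a made-up helper name.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... kernel: REISERFS error (device md11): vs-5150 search_by_key: ...
REISERFS_ERROR = re.compile(
    r"kernel: REISERFS error \(device (?P<device>\w+)\): (?P<code>[\w-]+) "
)


def count_reiserfs_errors(lines):
    """Count REISERFS error lines, keyed by (device, error code)."""
    counts = Counter()
    for line in lines:
        match = REISERFS_ERROR.search(line)
        if match:
            counts[(match.group("device"), match.group("code"))] += 1
    return counts


# The three syslog lines quoted in the post above.
sample = [
    "Aug 20 19:22:00 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 24537 does not match to the expected one 1",
    "Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 345831560. Fsck?",
    "Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [587 1500 0x0 SD]",
]
counts = count_reiserfs_errors(sample)
```

To run it against a saved syslog, pass the open file object instead of `sample`. Warning lines carry no device name in this layout, so the sketch skips them.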
JorgeB Posted August 20, 2017

1 minute ago, philouza said: Assuming I need to wait till the rebuild is finished, would I then run a reiserfsck --check against disk 11,

Once the rebuild finishes, run reiserfsck. There may also be some file corruption because we made parity valid when it really wasn't, but with 2 invalid disks it was your best option. Keep the old disk11 intact; you may still be able to copy some/most data from it if needed.

Consider upgrading to unRAID v6 and using dual parity; IMO it's recommended for your array size.
philouza Posted August 21, 2017

Ok, ran reiserfsck --check /dev/md11 and got the following...

Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
50 found corruptions can be fixed only when running with --rebuild-tree

Am I good just running reiserfsck --rebuild-tree /dev/md11, or should I add any options like '-S' or '--scan-whole-partition'?
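As a rough illustration of reading that report, the sketch below pulls out the corruption count and checks whether the output asks for --rebuild-tree. It matches only the exact wording quoted above; other reiserfsck versions may phrase the summary differently, and `summarize_check` is a hypothetical helper, not part of reiserfsprogs.

```python
import re


def summarize_check(output):
    """Return (corruption_count, needs_rebuild_tree) from --check output text."""
    match = re.search(r"(\d+) found corruptions", output)
    count = int(match.group(1)) if match else 0
    needs_rebuild = "only when running with --rebuild-tree" in output
    return count, needs_rebuild


# The output quoted in the post above.
sample_output = """\
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
50 found corruptions can be fixed only when running with --rebuild-tree
"""
count, needs_rebuild = summarize_check(sample_output)
```

It only reports what the tool already said; it does not decide whether extra options are appropriate.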
philouza Posted August 21, 2017

Cheers mate. You are such a superman on this forum. Saving us citizens daily.
philouza Posted August 21, 2017

Is it normal to not see your files during the rebuild-tree? All my shares (NFS and Samba) are empty and there are tons of these in the syslog...

Aug 21 19:32:03 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Aug 21 19:32:03 Harvey kernel: REISERFS error (device md11): zam-7001 reiserfs_find_entry: io error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Games Input/output error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Merkwell Input/output error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Public Input/output error
Aug 21 19:32:03 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one 6553
JorgeB Posted August 21, 2017

reiserfsck has to be run in maintenance mode, i.e., with the array unmounted:

https://wiki.lime-technology.com/Check_Disk_Filesystems#Drives_formatted_with_ReiserFS_using_unRAID_v5_or_later
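One way to double check that the device is not mounted before running reiserfsck is to look for it in the kernel's mount table. The sketch below parses text in the /proc/mounts layout (device, mount point, filesystem type, options). The sample entries and the `mounted_at` helper are hypothetical, and the wiki procedure linked above remains the authoritative way to do this.

```python
def mounted_at(device, mounts_text):
    """Return the mount points listed for `device` in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            points.append(fields[1])
    return points


# Hypothetical mount table, for illustration only.
sample_mounts = """\
/dev/md10 /mnt/disk10 reiserfs rw,noatime,nodiratime 0 0
/dev/md11 /mnt/disk11 reiserfs rw,noatime,nodiratime 0 0
"""
busy = mounted_at("/dev/md11", sample_mounts)
```

On the server itself you would pass the contents of /proc/mounts instead of `sample_mounts`; an empty list means the device is not mounted.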
philouza Posted August 21, 2017

doh... guess Rick and Morty will have to wait