New red ball while rebuilding

philouza · August 18, 2017

Running version 5.0.5. Got a red ball on one of my disks (PL1331LAGRTP6H) a week or so ago. The SMART report and test came back OK, so I figured it was the SATA controller or the cable. Moved the drive to a different controller with a different cable and went through the unassign-then-reassign procedure to start rebuilding it.

A few hours into the rebuild another disk (PL1331LAGSAUDH) red balled which seemed to have halted the rebuild. Unfortunately the smart report shows the drive is failing. I can still access the drive, however the data looks corrupt and I can only see ~12gb of the 3.7tb via samba.

So as of now, my array has PL1331LAGRTP6H with an orange ball that was in the middle of rebuilding and now PL1331LAGSAUDH with a red ball that seems screwed.

Have a new 6 TB drive handy. Have attached SMART reports for both drives and a syslog from the last reboot. Sorry, should have saved the syslog when the second drive fell over. Currently have the array stopped and it's showing 'configuration valid'. Hope someone can help. Cheers.

syslog

PL1331LAGRTP6H.txt

PL1331LAGSAUDH.txt

JorgeB · August 19, 2017

7 hours ago, philouza said:

I can still access the drive, however the data looks corrupt and I can only see ~12gb of the 3.7tb via samba.

That is normal: with 2 invalid disks and single parity, unRAID can't correctly emulate the missing data.
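To illustrate why two invalid disks defeat single parity, here is a minimal sketch. It assumes single parity behaves like a plain byte-wise XOR across the data disks; the helper `xor_blocks` and the toy disk contents are made up for the example and are not taken from unRAID itself.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of several equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" (hypothetical contents) and their single parity.
disks = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
parity = xor_blocks(disks)

# One disk missing: XOR of parity and the surviving disks gives it back.
recovered = xor_blocks([parity, disks[0], disks[2]])
assert recovered == disks[1]

# Two disks missing: parity and the survivor only yield the XOR of the
# two lost disks, which is not enough to tell them apart.
combined = xor_blocks([parity, disks[2]])
assert combined == xor_blocks([disks[0], disks[1]])
```

With one disk gone there is one unknown per byte and one parity equation, so the data can be emulated. With two gone there are two unknowns and still only one equation, which fits the garbage seen over Samba.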

Assuming disk12 data is unchanged since it first became disabled, you can do this, but it's possible disk12 will have some corruption, though very little, because it stopped during the rebuild:

-Utils -> New Config
-re-assign all disks, double check parity is in the parity slot
-check "parity is already valid" before starting the array
-start the array

Now check if data on disks 11 and 12 looks OK.

philouza · August 20, 2017

On 19/08/2017 at 6:30 PM, johnnie.black said:

Now check if data on disks 11 and 12 looks OK

Thanks so much for the reply. Did as you instructed and the array came back up. Kudos. Started a parity check and disk 11 eventually red balled again, killing the check. Tons of write errors, so I'm confident the drive is shot. See syslog_disk11_fail attached. Replaced it with a new 6 TB drive and it's currently rebuilding, which seems to be going OK with the exception of these errors in the current syslog:

Aug 20 19:22:00 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 24537 does not match to the expected one 1
Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 345831560. Fsck?
Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [587 1500 0x0 SD]
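For anyone wading through a long syslog like this, a rough sketch for tallying the ReiserFS messages by severity, device, and message code may help. The regular expression is only a guess based on the three lines quoted above, and `count_reiserfs_events` is a hypothetical helper, not part of unRAID or reiserfsprogs.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines only; other kernel versions
# may format these messages differently.
PATTERN = re.compile(r"REISERFS (warning|error)(?: \(device (\w+)\))?: ([\w-]+)")


def count_reiserfs_events(lines):
    """Count ReiserFS kernel messages keyed by (severity, device, code)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


SAMPLE = [
    "Aug 20 19:22:00 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 24537 does not match to the expected one 1",
    "Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 345831560. Fsck?",
    "Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [587 1500 0x0 SD]",
]

for (severity, device, code), n in count_reiserfs_events(SAMPLE).items():
    print(severity, device, code, n)
```

To run it against a saved syslog, pass an open file object instead of `SAMPLE`. If every error points at the same device, that narrows the filesystem check to a single disk.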

Attached that syslog below too. Assuming I need to wait till the rebuild is finished, would I then run a reiserfsck --check against disk 11, or should I kick off a parity check first and go from there?

syslog_disk11_fail

syslog

JorgeB · August 20, 2017

1 minute ago, philouza said:

Assuming I need to wait till the rebuild is finished, would I then run a reiserfsck --check against disk 11,

Once the rebuild finishes, run reiserfsck. There may also be some file corruption because we marked parity valid when it really wasn't, but with 2 invalid disks it was your best option. Keep the old disk11 intact; you may still be able to copy some/most of the data from it if needed.

Consider upgrading to unRAID v6 and using dual parity, IMO it's recommended for your array size.

philouza · August 21, 2017

Ok ran reiserfsck --check /dev/md11 and got the following...

Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
50 found corruptions can be fixed only when running with --rebuild-tree

Am I good just running reiserfsck --rebuild-tree /dev/md11 or should I add any options like '-S' or '--scan-whole-partition'?

JorgeB · August 21, 2017

Use just --rebuild-tree

philouza · August 21, 2017

Cheers mate. You are such a superman on this forum. Saving us citizens daily.

philouza · August 21, 2017

Normal to not see your files during the rebuild-tree? All my shares (NFS and Samba) are empty and tons of these in the syslog...

Aug 21 19:32:03 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Aug 21 19:32:03 Harvey kernel: REISERFS error (device md11): zam-7001 reiserfs_find_entry: io error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Games Input/output error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Merkwell Input/output error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Public Input/output error
Aug 21 19:32:03 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one 6553

JorgeB · August 21, 2017

reiserfsck has to be run in maintenance mode, i.e., with the array unmounted:

https://wiki.lime-technology.com/Check_Disk_Filesystems#Drives_formatted_with_ReiserFS_using_unRAID_v5_or_later

philouza · August 21, 2017

doh... guess Rick and Morty will have to wait
