[SOLVED] Strange messup with disks

CaptainSpalding · June 2, 2011

Sorry for not having a syslog, but for some reason the "telnet tower" or "telnet + ip" commands don't work. Is it because of Win7?

I'm running unraid 4.7

My problem:

A couple of days ago I replaced one Seagate disc with a WD 2TB EARS disc. I had it MBR 4K-aligned. I have not used jumpers on my previous EARS HDDs, and I have one unaligned EARS in my array at the moment.

Did the data-rebuild and everything was fine. I have power cycled it with no probs.

But the new drive was outside the case the whole time, so I decided to install it inside. I did that, powered on, and one other disk had a blue dot (asking for a data-rebuild). At first I thought it was the cables, but I moved the disk to another cable and the same disk still had the blue dot. Then I tried swapping four cables at random, and got four red dots. The disk order was no longer correct, so I was putting the disks back in order when I noticed that the slot holding the new disk reported that the correct disc for it was the one I had replaced.

There was an option to start the array and let unRAID save the new disc order. I did that and started a parity check straight afterwards, but I got major sync errors.

What could be wrong? How would the array understand that there is a new disc that is correct?

Thanks in advance!

syslog.txt

jazzysmooth · June 2, 2011

I'll withhold my comments on randomly changing cables...

In any case, when you moved the disk inside the machine, did you connect it to the same cable / SATA port as it was connected to when outside the box? If not, it's quite possible the disk order changed when you moved it inside.

Are you CERTAIN the parity disk is assigned correctly? If it isn't, and you started unRAID, you just wiped whatever data was on that drive.

If you are comfortable that parity has been correctly assigned, and the data on the other drives is intact, you can type initconfig at the server prompt, which will remove the current configuration, allow you to reassign the drives, start the array, and it will recalculate parity. NOTE: I'd strongly suggest posting a syslog before going this route if at all possible.

CaptainSpalding · June 2, 2011

Thanks, I'll try with putty.

Parity was assigned correctly. I was just amazed that replacing an HDD could wipe an entire disc.

CaptainSpalding · June 2, 2011

Syslog added.

The problem disc is the one with 5004 as its last four digits. It had the blue dot.

unRAID also suggested that its slot belonged to 692Y, which was the replaced disk.

dgaschk · June 2, 2011

The syslog shows no problem with disk7.

CaptainSpalding · June 2, 2011

No, it does not, because I saved the correct disc order as unRAID prompted me to.

But parity check gives me a lot of sync errors.

dgaschk · June 2, 2011

The syslog does not show a parity check.

CaptainSpalding · June 2, 2011

I just powered it up for the syslog.

Should I try to do a parity check and send a new syslog?

dgaschk · June 2, 2011

Yes.

CaptainSpalding · June 2, 2011

Ok, I will do that.

CaptainSpalding · June 2, 2011

Here's syslog after parity check.

syslog.txt

dgaschk · June 2, 2011

This does not show a parity check, but it does show an issue with disk7. Does disk7 still show green? Post a SMART report for disk7.

CaptainSpalding · June 2, 2011

It shows red and unformatted.

Syslog attached.

smart.txt

dgaschk · June 3, 2011

Does that drive have data on it?

CaptainSpalding · June 3, 2011

It should.

I have not formatted it or done a data-rebuild.

I noticed that the SMART report was from the wrong disc. I replaced it with the correct report.

Rajahal · June 3, 2011

> Sorry for not having a syslog, but for some reason the "telnet tower" or "telnet + ip" commands don't work. Is it because of Win7?

Yes, the telnet client is disabled by default in Win7. Here's how to enable it:

Start Menu,

Control Panel,

Programs and Features,

Turn Windows features on or off,

Telnet Client (ensure it's checked/selected),

Click OK

PuTTY works fine too.
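
If it is unclear whether the missing telnet client or the server itself is the problem, a quick reachability check can help tell them apart. This is only a rough sketch: the helper name `port_open` is made up for illustration, and it assumes the server answers to the hostname "tower" (as in the commands quoted above) and listens for telnet on the standard TCP port 23.

```python
import socket

def port_open(host, port=23, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    port_open is a hypothetical helper; port 23 is the standard telnet
    port, which is assumed (not confirmed here) to be what unRAID uses.
    """
    try:
        # create_connection resolves the name and connects in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False
```

Calling `port_open("tower")` from the Windows machine and getting True would suggest the server side is fine and only the telnet client is missing; False points at name resolution or the network instead.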

CaptainSpalding · June 3, 2011

Thanks Rajahal!

I think the problem lies here:

Jun 3 03:02:57 Tower logger: mount: /dev/md7: can't read superblock

Jun 3 03:02:57 Tower emhttp: _shcmd: shcmd (40): exit status: 32

Jun 3 03:02:57 Tower emhttp: disk7 mount error: 32

Jun 3 03:02:57 Tower emhttp: shcmd (42): rmdir /mnt/disk7

Jun 3 03:02:57 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 1 does not match to the expected one 4

Jun 3 03:02:57 Tower kernel: REISERFS error (device md7): vs-5150 search_by_key: invalid format found in block 94471506. Fsck?

Jun 3 03:02:57 Tower kernel: REISERFS (device md7): Remounting filesystem read-only

Jun 3 03:02:57 Tower kernel: REISERFS error (device md7): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1 2 0x0 SD]

What do those mean?

Joe L. · June 3, 2011

It means you must fix the file-system corruption. To prevent further corruption, the OS has made the file system read-only until it is fixed.

Follow the steps in the wiki here: http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems
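
For anyone wanting to spot lines like these in a long syslog, here is a minimal sketch of a filter. The patterns are taken only from the log lines quoted in this thread, so they are illustrative and not a complete list of what unRAID or ReiserFS can report; the function name `find_fs_errors` is made up for the example.

```python
import re

# Patterns based on the syslog lines quoted in this thread (an assumption:
# other releases or file systems may word their messages differently).
PATTERNS = [
    re.compile(r"REISERFS (error|warning)"),
    re.compile(r"can't read superblock"),
    re.compile(r"disk\d+ mount error"),
    re.compile(r"Remounting filesystem read-only"),
]

def find_fs_errors(lines):
    """Return the syslog lines that match any pattern above."""
    return [line.rstrip("\n") for line in lines
            if any(p.search(line) for p in PATTERNS)]

sample = [
    "Jun 3 03:02:57 Tower logger: mount: /dev/md7: can't read superblock",
    "Jun 3 03:02:57 Tower emhttp: shcmd (42): rmdir /mnt/disk7",
    "Jun 3 03:02:57 Tower kernel: REISERFS error (device md7): vs-5150 "
    "search_by_key: invalid format found in block 94471506. Fsck?",
    "Jun 3 03:02:57 Tower kernel: REISERFS (device md7): Remounting "
    "filesystem read-only",
]

for hit in find_fs_errors(sample):
    print(hit)
```

To run it against a real log, pass an open file instead of the sample list, for example `find_fs_errors(open("syslog.txt"))`.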

CaptainSpalding · June 8, 2011

Thanks Joe!

I did the rebuild-tree and now it seems to be fine with reiserfsck --check.

But it still has a red dot. Now it's not showing as unformatted, but it's not in the array.

Syslog included.

syslog-2011-06-09.txt

CaptainSpalding · June 9, 2011

I'm just wondering: could the "start and let unRAID save the new disc order" option be the reason for the red dot?

Could it be possible that the parity saved the broken file system when I applied that option?

And now that it is fixed, it thinks it's not correct?

When I saved the new disc order all dots were green; I tried a parity check and got a massive number of errors. I rebooted and the 7th drive showed as unformatted.

Would it be safer to just rebuild the drive from parity?

Joe L. · June 9, 2011

A disk with a "red" indicator has had a "write" to it fail.

It will not go back into service simply because you fix the reason for the write failure (loose cable).

The only way to get the same drive back is to allow unRAID to re-construct the contents onto it (including the "write" that had originally failed).

The only way unRAID will re-construct onto a drive is for it to think it is a "new/different" drive than the original. You can fake out unRAID to think the old disk is its own replacement by:

Stopping the array

Un-assigning the failed disk

Starting the array with the disk un-assigned (this causes unRAID to forget the model/serial number)

Stopping the array

Re-assigning the failed disk

Starting the array once more. (this will re-construct the disk based on parity and all the remaining disks)

Once the failed disk is re-constructed you also need to perform a parity check, to ensure the contents you wrote are readable.
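
To illustrate why the re-construction depends on parity and all the remaining disks being correct, here is a toy sketch of single-parity (XOR) reconstruction. It is a conceptual model only, not unRAID's actual code; the three four-byte "disks" and the helper `xor_blocks` are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))

# Hypothetical toy array: three data "disks" of four bytes each.
disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(disks)

# Pretend the disk at index 1 failed: rebuild it from parity + survivors.
survivors = [d for i, d in enumerate(disks) if i != 1]
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == disks[1])   # True: good parity gives back the disk

# If parity is wrong in even one byte, the rebuilt disk is wrong too.
bad_parity = bytes([parity[0] ^ 0xFF]) + parity[1:]
print(xor_blocks(survivors + [bad_parity]) == disks[1])   # False
```

The second case is why a rebuild is only as trustworthy as the parity it is built from, and why the follow-up parity check is worth running.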

CaptainSpalding · June 9, 2011

Thanks for the clear instructions!

I just want to make sure that the option to "let unRAID save the new disc order" didn't mess up my parity?

Thank you so much for your time!

CaptainSpalding · June 10, 2011

The data-rebuild did not finish; here's the syslog.

The drive still has a red dot and shows 288 errors after the data-rebuild.

Is that HDD a goner, or is it the parity?

syslog.zip

CaptainSpalding · June 11, 2011

Sorry to bump this, but do you think that a new HDD would solve this?

CaptainSpalding · June 14, 2011

I'm getting a new HDD today, but I would still like to ask one thing.

When reiserfsck is used, does it just fix something that a data-rebuild would fix too?

So if I have one disk that has write errors, could I just connect a new one and start a data-rebuild onto it?

Or does reiserfsck somehow change the parity data?

Bear with me, I'm trying to figure out this situation and I'm not a Linux-savvy person. :-[
