A disk has become "read only"

July 28, 2010

From time to time my toddler daughter gets into the room the server is in and knocks the eSATA external boxes that 9 of my drives are in. This can cause one or more of my drives to lose the connection and appear as failed.

I had a failed drive recently, and since it had been working fine, I assumed it was just a bad connection. I reassigned it and did the "trust my parity" procedure on it. All data appeared to be intact, but now no writes or deletions can be carried out on that drive. I have been able to copy anything FROM it, though.

The console is showing:

REISERFS error (device md14): vs-4080 _reiserfs_free_block: block 224913xxx: bit already cleared

I can't tell exactly how many of these lines there are, but the ones with xxx = 958 down to 947 are currently shown on screen.

I have enough space on another disk in the array to copy all the contents of this 1 TB drive over. This would be a backup in case rebuilding the drive fails for some reason. I could then replace the disk, or if anyone suspects it could be fine, maybe I'd do a preclear and see what happens. If all is well I could put it back in the array... or just RMA it.

Any advice greatly appreciated.


July 28, 2010


You need to:

1. Get a SMART report on that drive

smartctl -d ata -a /dev/sdX

where sdX = the device corresponding to disk14.

2. Check the file-system following the procedure in the wiki. The file-system is being made read-only to prevent you from doing damage to it while it is in a corrupted state. Once repaired, odds are it will be fine for many years to come.

The procedure is here:

http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems
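For step 1, a few attributes in the SMART report are usually the first ones to look at after a suspected connection problem. The following is only a rough sketch, not part of the wiki procedure: it assumes the attribute table printed by smartctl -A has the attribute name in column 2 and the raw value in column 10, and the function name flag_smart is made up for this example.

```shell
# Hypothetical helper: read `smartctl -A` style output on stdin and print
# any of the watched attributes whose raw value is greater than zero.
# Assumption: attribute name is field 2, raw value is field 10.
flag_smart() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}

# Example usage (sdX is a placeholder for the real device, as above):
#   smartctl -d ata -A /dev/sdX | flag_smart
```

No output means none of the watched raw values is above zero. A non-zero UDMA_CRC_Error_Count generally points at the cable or connection rather than at the disk itself.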

Joe L.


July 28, 2010

Author

Thank you, Joe. I will get these procedures started and post back the results. You (especially) and a few others on these forums deserve a medal.


July 28, 2010

Author

SMART report attached.

smart.txt


July 28, 2010

The SMART report looks fine. Odds are that repairing the file-system will be all that is required to get you back to where you can write to the drive once more.


July 28, 2010

Author


Thanks.

REISERFSCK result (last 2 lines):

Bad nodes were found, Semantic pass skipped
11 found corruptions can be fixed only when running with --rebuild-tree

So what now? There's a lot of red lettering in the wiki when it comes to running with --rebuild-tree.


July 28, 2010


The wiki says not to run it unless you are instructed to by a prior run of reiserfsck. You have been instructed.
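As a rough illustration of "let the check tell you what to run next", the sketch below reads saved reiserfsck check output and echoes the repair option that output asks for. It assumes the output was saved to a file (check.log is a made-up name) and that the summary line names the option literally, as in the result posted above; the function name is hypothetical.

```shell
# Hypothetical helper: read saved reiserfsck check output on stdin and
# echo which repair option that output asks for, or "none".
next_reiserfsck_step() (
    out=$(cat)
    if printf '%s\n' "$out" | grep -q -e '--rebuild-tree'; then
        echo "--rebuild-tree"
    elif printf '%s\n' "$out" | grep -q -e '--fix-fixable'; then
        echo "--fix-fixable"
    else
        echo "none"
    fi
)

# Example usage, assuming the check output was saved to check.log:
#   next_reiserfsck_step < check.log
```

This only reports what the check asked for. The repair itself should still be run by hand, following the wiki procedure.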

Joe L.


July 28, 2010

Author

Now I've been instructed twice! I was just wondering whether it would be prudent to copy as much data off the disk as I can first, especially if rebuilding the tree can "leave the file system in worse shape than it originally was!"

So I started to do it, and then the warning that came up scared me a little. I've decided to see if I can back up as many of the movies and TV shows on the faulty disk as I can to a disk that has space on it before rebuilding the tree.


July 28, 2010


If you wish... It certainly cannot hurt.

What will happen with --rebuild-tree is that it will create a lost+found directory to hold the files, parts of files, and directories that it cannot identify. Those same files will probably not be copyable, since they cannot be reached.
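If you do copy first, something along these lines would copy whatever is readable and keep a list of what is not. It is only a sketch: the function name is made up, both the destination and the log must be given as absolute paths, file names containing newlines are not handled, and the mount points in the usage comment (/mnt/disk14 and /mnt/diskN) are assumptions about where the disks are mounted.

```shell
# Hypothetical helper: copy every regular file under $1 into $2, keep going
# when a file cannot be read, and list the failures in $3.
# $2 and $3 must be absolute paths because the function changes directory.
copy_what_you_can() (
    src=$1; dst=$2; log=$3
    mkdir -p "$dst" || exit 1
    : > "$log"
    cd "$src" || exit 1
    find . -type f | while IFS= read -r f; do
        mkdir -p "$dst/${f%/*}"
        cp -p "$f" "$dst/$f" 2>/dev/null || printf '%s\n' "$f" >> "$log"
    done
)

# Example usage (paths are placeholders):
#   copy_what_you_can /mnt/disk14 /mnt/diskN/disk14_backup /mnt/diskN/failed.log
```

Anything listed in the log is a candidate for turning up in lost+found after the rebuild.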


July 28, 2010

Author

Yeah, I figured that might be the case. I may as well try to rebuild the tree now, I guess. So far the only files that haven't been copyable are tag data generated by a program, and they are only a few KB each.


July 28, 2010

Just an FYI

You said:

... I assumed it was just a bad connection. I got it reassigned and did the "trust my parity" on it. All data appeared to be intact, but now no writes or deletions can be carried out on that drive. I have been able to copy anything FROM it, though.

The data disk was taken out of service because it could not be written to. There is a 100% chance that some file, directory, or other structure on it is NOT correct.

The parity disk was updated, however. Let's say you wrote to the disk for hours before discovering it was "red".

If you had elected to re-construct the data onto the drive that had not properly been written to, it would have had all those files you wrote during those hours.

Instead, you elected to "trust" the data disk was correct, and parity wrong. Meaning all the files written would be lost. (Remember, we are certain the data disk is not right, since at least one "write" to it failed. Possibly many "writes" failed.)

Next time don't be so quick to use the "trust" procedure. It is more likely than not to get you into a situation like the one you are in now. Instead, elect to re-construct the data onto the drive which had gone off-line. It probably would have been correct.

1. Stop the array

2. Un-Assign the disk that became disconnected.

3. Power down

4. Fix the bad connection

5. Power Up

6. Start the array with the disk un-assigned (this will cause unRAID to forget its model/serial number)

7. Stop the array once more

8. Re-assign the disk. (It will treat it as a new replacement, since it forgot its original model/serial number)

9. Start the array one last time. (It will re-construct the contents of the drive back to itself based on parity and the remaining other drives.)

When the re-construction is complete you'll have parity protection once more AND all the data that could not be written to the drive when your daughter broke the connection to it.
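The difference between re-constructing and "trusting" can be shown with a toy model. This is only an illustration, assuming one byte per disk and parity computed as the XOR of all data disks; the values are made up.

```shell
# Toy model: one byte per disk, parity is the XOR of all data disks.
d1=5; d2=9; d3=12
parity=$(( d1 ^ d2 ^ d3 ))

# A write changes disk 2 from 9 to 7. The write to the disk itself fails,
# but parity is still updated as if it had succeeded.
new_d2=7
parity=$(( parity ^ d2 ^ new_d2 ))   # d2 on the disk is still the stale 9

# Option 1: re-construct disk 2 from parity and the other disks.
rebuilt_d2=$(( parity ^ d1 ^ d3 ))

# Option 2: "trust" the data disks. Parity is recomputed from the stale
# contents, so the failed write is gone for good.
trusted_parity=$(( d1 ^ d2 ^ d3 ))
trusted_d2=$d2

echo "rebuilt disk2: $rebuilt_d2"
echo "trusted disk2: $trusted_d2 (parity now $trusted_parity)"
```

Re-construction recovers the new value (7), while trusting leaves the stale value (9) in place and makes parity agree with it.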

Joe L.


July 28, 2010

Author

Thanks, Joe. I am normally more cautious about these things, and would probably have reconstructed the data. However, in this case I know I still have all the data that I tried to write to the array since the earliest point at which the write error could have occurred. I will, however, be more careful in future. Lesson learned.
