January 20, 2010 I've been running the 3-drive free version for a while, and just purchased Pro and a new drive as I was running out of space. Since the new drive is 7200 RPM vs 5900 on the others, I figured I'd make the new one the parity drive. So I added the drive, reassigned it as the parity drive, and kicked off a sync. About 30% in I noticed the drive temps were a little higher than I'd like, so I opened the case and checked the fans, etc. - and I guess I must have brushed up against one of the data drive SATA cables. Suddenly disk 2 was showing 0 degrees and tons of write errors (which I thought was strange since it should have only been doing reads). I stopped the sync, shut down the server to finish my cable work, and then started it back up. At that point Disk 2 showed as unformatted. I did a few searches and ran reiserfsck on the disk; it shows no errors and I can see the data when it's doing the check. However, each time I start Unraid, it shows as unformatted. I have not tried the restore button, and I do still have the original (good) parity drive, which is installed but not assigned (my plan was to make that a data drive after the parity sync was complete). I've attached the syslog. Any assistance would be greatly appreciated! Brian syslog.txt
January 20, 2010 Author Just as a follow-up, I'm able to mount the disk via telnet and access it without issue... Weird (to me at least).
January 21, 2010 Author OK, well, being an impatient person and not seeing any responses, I went and fixed it, although likely not in the easiest manner. I removed the drive it saw as unformatted as well as the parity drive (since I was going to be dumping a bunch of data into it, I figured I'd wait until the end to calculate the parity), added the old parity drive as a data drive, and clicked restore. Then, after formatting the new data drive, I mounted the "unformatted" drive manually via telnet and used Midnight Commander to copy the data from the mount over to the new data drive. I then re-added the new parity drive and the "unformatted" drive with the intention of having Unraid format it and then sync parity, when I was surprised to see that Unraid was now mounting the old data drive correctly. Since I basically now had 2 copies of all that data, I used Midnight Commander to remove the duplicate data and then kicked off the parity sync. Still curious, though, as to what I should have done? I didn't immediately run reiserfsck --rebuild-sb due to the other posts recommending not to use the fix options if no corruption was found. Also curious as to why Unraid was able to recognize the drive as formatted when I re-added it as the 4th drive, in a system that previously only had 3 drives?
March 9, 2010 It IS a little confusing. Sorry for such a late response! It would have helped if you had known to capture the syslog in that session when the trouble first occurred, prior to shutting down to fix the cabling. Your syslog from the subsequent boot indicates why Disk 2 shows as 'Unformatted'. It found the 4 drives, and partitions on each, but when it attempted to start the array and mount the drives, 2 critical starting blocks of the Reiser file system on Disk 2 could not be read, so a Reiser file system could not be identified, and this becomes the one case where the partition truly does look unformatted - no file system. You retried 3 more times, with the same result. Of interest is the fact that one line of the syslog during this period includes some unusual overwriting, and there are some unusual delays that occur during the fourth attempt, which *may* indicate an unstable system. The syslog ends at that point, so none of the later work is logged. When you ran reiserfsck, it *should* have failed very quickly with an instruction to rerun with a more serious repair parameter, most likely --rebuild-sb. I don't understand how it could have proceeded at all at that point, and although I hate to suggest it, I wonder if your memory of the sequence of actions and results at that time may possibly be erroneous? (I do apologize for suggesting it, and realize it's a long time ago now.) There is essentially no difference between the way unRAID mounts the file system and a manual mounting of the file system, so it is not surprising that, after you were able to mount it and copy files, unRAID could later do the same. It recognizes a drive with a good Reiser 3.6 file system in the first partition as a normal unRAID drive. That would also indicate that at some point earlier, reiserfsck was successful in fixing the initial corruption in the Reiser superblock, or those initial blocks were somehow rewritten correctly.
As to why the block-read errors occurred, that is rather strange, and I can't explain what really happened. There are NO drive exceptions here, or even delays that could be associated with trouble reading the sectors. Within the same second as the starting of the array, the 'bread' (block-read) errors occur, and the mounting is aborted for Disk 2, which I find strange. There is no indication as to why the reads could not occur, and the drive itself does not report any problems. If there had been delays in the timeline, then I would have suggested that the cable glitch may have caused an electrical spike that corrupted sectors, and the drive is taking longer than expected to read the blocks, while it tests and 'error-corrects' those sectors, and possibly even remaps them, which makes them completely readable later. But that process takes *time*, and that time would be obvious in the syslog.
March 10, 2010 Author Well, first I'd like to say I appreciate you responding, even if it is almost 2 months later. As you said, it occurred some time ago, but I'm pretty certain I never ran reiserfsck with the -rebuild option, namely because it didn't give an error during the check, and because I was able to manually mount the disk without any issue. Ultimately I got the system back up without any data loss, which was my utmost concern. And the system has been running flawlessly, so I don't believe there are any outstanding hardware issues. The long and short of it is I caused the issue by attempting to work on the PC while it was performing the sync - something I won't try again.
March 11, 2010 I guess this was missed. You could have put the old parity drive back in and had unRAID rebuild the drive that appeared to have failed onto the new drive.
Basically, I would have said to first run a preclear script on the new drive.
Then assign the old parity drive as parity again and the new drive in place of the failed one.
Hit the restore button.
On the console, type 'mdcmd set invalidslot 2'
Start the array.
I think that covers it. Then it would rebuild disk 2 onto the new drive (the "invalidslot 2") instead of rebuilding parity onto the parity drive, which would be an "invalidslot 0" and is the default after the restore button is pressed. You still might have had to do a file system check or recovery to get the data to work; I'm not sure about that part until someone tries the above first.
Peter
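For anyone reading this later, here is a rough Python sketch of the idea behind choosing which slot gets rebuilt. It is only a toy model and assumes simple single-parity XOR across the drives; it is not unRAID's actual driver code, and the names `xor_blocks`, `rebuild_slot`, and the 4-byte "drives" are made up for illustration. Slot 0 stands for parity, slots 1 to 3 for the data disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_slot(slots, invalid):
    """Recompute the contents of one slot from all the other slots.

    slots[0] is the parity drive; slots[1:] are data drives (toy model).
    The current contents of the invalid slot are ignored.
    """
    others = [blk for i, blk in enumerate(slots) if i != invalid]
    return xor_blocks(others)


# Three tiny made-up "data drives" of 4 bytes each.
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
disk3 = bytes([0x01, 0x02, 0x03, 0x04])

# Treating slot 0 as invalid: parity is computed from the data drives.
parity = rebuild_slot([b"\x00" * 4, disk1, disk2, disk3], invalid=0)

# Treating slot 2 as invalid: disk 2 is recomputed from parity and the
# remaining data drives, starting from a blank replacement drive.
blank = b"\x00" * 4
rebuilt_disk2 = rebuild_slot([parity, disk1, blank, disk3], invalid=2)

print(rebuilt_disk2 == disk2)
```

The point of the sketch is that the same XOR operation serves both cases; the only thing that changes is which slot is treated as untrusted. That is why it matters that the old, known-good parity drive is the one assigned before rebuilding a data disk.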