September 28, 2006

Okay, I had booted and run Beta4 fine since just after it came out. Tonight I decided to swap in a new disk and expand one of my drives. That disk was Disk11, and I was swapping a 320Gig drive for a new 400.

Upon booting it up the first time, a cable was bad up top, so I halted the array, powered it down, and rejiggered the cable. Brought the array up and all looked good! The array gave me the option to expand the drive and Disk11 was a blue dot - cool so far. I noted that it was also giving me the option to format a drive but obviously didn't take it.

A few moments ago I tried to bring up Disk8 and was rejected. Upon looking at the status I see that Disk11 is now Orange and listed as "Unformatted" and that Disk8 is ALSO Orange and listed as unformatted. No writes had been occurring to Disk8, only to Disk11, and the drive type etc. was recognized correctly.

I used Telnet to try to examine Disk8, but it's not showing me anything. I tried the same thing with Disk11 and got the same feedback - nothing there.

Screen output looks like the following. Note that I have stopped the rebuild for fear that it's putting down bad data on Disk11 because it sees no data on Disk8. I don't know what to do next and I fear I'm about to lose data on both disks - this would be "bad"!

Disk status

Disk    Model/Serial No.                     Temperature  Size         Free         Reads    Writes   Errors
parity  ST3400632A/5NF108P4                  37°C         390,711,352  -            480,583  7        0
disk1   ST3400632A/5NF0Z9KP                  38°C         390,711,352  110,027,980  483,794  1        0
disk2   ST3400620A/3QG043W1                  37°C         390,711,352  1,316,780    483,575  1        0
disk3   ST3400620A/5QH00QT6                  38°C         390,711,352  1,943,860    483,575  1        0
disk4   ST3300831A/4NF08RJX                  37°C         293,036,152  710,596      482,822  1        0
disk5   ST3300831A/3NF0WH07                  38°C         293,036,152  48,467,128   482,835  1        0
disk6   ST3300831A/3NF06PTA                  38°C         293,036,152  1,094,612    486,687  1        0
disk7   Maxtor 6L300R0/L60MYRFG              37°C         293,057,320  2,386,176    482,830  1        0
disk8   ST3300831A/4NF08RMG                  36°C         293,036,152  Unformatted  480,576  0        0
disk9   WDC WD3000JB-00KFA0/WD-WCAMR1641443  34°C         293,036,152  1,081,692    482,821  1        0
disk10  WDC WD3000JB-00KFA0/WD-WCAMR1587689  35°C         293,036,152  885,580      482,821  1        0
disk11  ST3400620A/5QG02J9B                  39°C         390,711,352  Unformatted  0        480,578  0

Heeelp!

P.S. On the plus side - I have the previous disk that was in #11's slot. It occurs to me that if I've lost data on #8 I can likely retrieve it by reinserting the old #11 and rebuilding. That's ASSuming the Parity is still correct, of course. Fingers crossed!
September 29, 2006 Author

For anyone who might be wondering - Tom and I are working this offline; he's not ignoring me. We're just trying to figure out WTF happened right now.
September 29, 2006 Author

I've not heard from Tom this evening, so I am carefully poking around a little. The array is powered down and I have removed Disk8 from it. Worth noting is that Disk8 was in the cage from which I removed Disk11, so it is possible this had something to do with my tinkering.

Anyway, with Disk8 removed and mounted on my XP system, I brought up an info screen using rfstool and this is what it had to say about Disk8 - not looking good!

----- Info on \\.\PhysicalDrive0 ------
Drive uses XP-style DISK_GEOMETRY_EX:
Cylinders = 36481
MediaType = 12
TracksPerCylinder = 255
SectorsPerTrack = 63
BytesPerSector = 512
DiskSize = 300069052416
Got DRIVE_LAYOUT_INFORMATION_EX
PartitionCount = 4
PartitionStyle = 0 (MBR)
Signature = 0x97210123
--- PARTITION 0 ---
PartitionStart = 32256
PartitionLength = 300069020160 (286168.12 MB)
PartitionType = 131
Read 512 bytes at offset 97792
Doesn't seem to be a ReiserFS partition (00000000 00000000 00000000)

Ooookay.... So Disk8 is apparently no longer good, but I still have the old Disk11. I think I will try mounting it in the same manner to make sure this tool is working etc. etc. I MAY try putting the old Disk11 into the system and bringing the system up to see if it wants to rebuild 8. Will post back in a bit!
September 29, 2006 Author

Slapped in the old Disk11 and checked it with rfstool, and it sees the drive fine as ReiserFS. Gotta say, the rfstool included with YAReG 1.0 is pretty good! So, I think I can rebuild Disk8 using the old Disk11 and the current Parity, which I believe is good. Then try the upgrade again - we'll see. Below is what rfstool reported for me in case anyone cares...

----- Info on \\.\PhysicalDrive0 ------
Drive uses XP-style DISK_GEOMETRY_EX:
Cylinders = 38913
MediaType = 12
TracksPerCylinder = 255
SectorsPerTrack = 63
BytesPerSector = 512
DiskSize = 320072933376
Got DRIVE_LAYOUT_INFORMATION_EX
PartitionCount = 4
PartitionStyle = 0 (MBR)
Signature = 0xbf87a178
--- PARTITION 0 ---
PartitionStart = 32256
PartitionLength = 320072901120 (305245.31 MB)
PartitionType = 131
Read 512 bytes at offset 97792
Is ReiserFS partition type 2 (73496552 46327245 00000073)

P.S. Got your mail Tom, responding now!
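For anyone who wants to double-check rfstool's verdict by hand, here is a minimal Python sketch of the same test. It leans on two details of the ReiserFS 3.x on-disk layout: the superblock starts 64 KiB (65536 bytes) into the partition, which is why rfstool reads at offset 97792 = 32256 + 65536, and the magic string sits 52 bytes into the superblock. The function names are made up for illustration, and the idea that rfstool prints the magic field as three little-endian 32-bit words is an assumption, although it does decode cleanly for the old Disk11.

```python
import struct

SUPERBLOCK_OFFSET = 64 * 1024   # ReiserFS 3.x superblock starts 64 KiB into the partition
MAGIC_OFFSET = 52               # position of the magic string inside the superblock
KNOWN_MAGICS = (b"ReIsErFs", b"ReIsEr2Fs", b"ReIsEr3Fs")


def read_magic(dev, partition_start):
    """Return the 12 bytes at the magic-string position of the superblock.

    `dev` is any seekable binary file object (a raw disk or an image file).
    """
    dev.seek(partition_start + SUPERBLOCK_OFFSET)
    block = dev.read(512)
    return block[MAGIC_OFFSET:MAGIC_OFFSET + 12]


def looks_like_reiserfs(magic):
    """True if the bytes start with one of the known ReiserFS magic strings."""
    return any(magic.startswith(m) for m in KNOWN_MAGICS)


def decode_rfstool_words(text):
    """Turn three hex words back into bytes (assumption: little-endian 32-bit words)."""
    words = [int(w, 16) for w in text.split()]
    return struct.pack("<3I", *words)


if __name__ == "__main__":
    old_disk11 = decode_rfstool_words("73496552 46327245 00000073")
    disk8 = decode_rfstool_words("00000000 00000000 00000000")
    print(old_disk11, looks_like_reiserfs(old_disk11))
    print(disk8, looks_like_reiserfs(disk8))
```

Decoded this way, the words reported for the old Disk11 spell out the "ReIsEr2Fs" magic, while the all-zero words from Disk8 mean nothing recognizable is left at the spot where the superblock should be.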
October 3, 2006 Author

Okay, so here's how things have gone....

While expanding a disk I noticed that a disk not being worked on was coming up listed as unformatted. I still had the original disk for the one I was attempting to expand, but upon inserting it the system wouldn't start, since it was looking for the disk I had been trying to expand. Tom worked with me and the next step was to swap around some OTHER disks. This was supposed to get the system to rebuild the disk that had come up unformatted. I even backed up the disk that was coming up unformatted.

Unfortunately, instead of trying to rebuild that disk, it proceeded to trash the disk I'd previously tried to expand. So far as I could tell Parity was good, but once it trashed that disk this was akin to a double failure. No forensics software I could find was able to find files on either disk, and reiserfsck was also unable to correct this. I'm actually still running that on the second disk now, but judging from the previous results and the extremes I'm having to go to right now to even get it to run, I'm not hopeful. It's giving some info that seems to say there might be data but I'm skeptical. The F/S signature was complete zeroes for both disks it seems, ouch.

Not sure what to say at this point, not sure exactly what happened or why. I have three disks to put into the array and am a bit gun-shy but need to do it. This is why I've put in a request for a feature to be better able to control the array. I'd like to have been able to specify which disk I wanted rebuilt instead of the software choosing the wrong one.

As it stands now I'm out about 320Gig worth of DVDs on one disk, plus 190+ gigs worth of files and about 100 more gigs worth of DVDs on the other drive, unless something comes back. Luckily I have a My Movies database that lists where each DVD was located on disk, so I can at least know what to rerip, but I'll have to dig those DVDs out of storage - doh!

Anyway, that is how this is ending up right now. Tom has worked with me a good bit to try and get this back, and last we traded e-mail he was puzzled as to what happened too. Since no one else has had this issue and I'm pretty sure Tom has tested like mad, I dunno what occurred - some quirk of my setup I guess. If I get any data back I'll update here.

Mind you, if this had been a standard RAID with data striped I'd be looking at a 4TB or so loss, so I guess I cannot be too bitter. My MP3s are safe but darn it I lost a lot of TV shows! Ah well, could obviously be a lot worse!
October 3, 2006

There are a few details missing from this saga and the sequence is a bit off. Be that as it may, a very useful thing to do before attempting to change the array disk config is to save a copy of the config/super.dat file on the Flash. This file captures the state of the array: disk config, disabled disk status, parity valid/invalid status, etc. So you could make a copy of this file, call it say super.bak, and do the upgrade/expand operation. If things go south, you could restore the file and all is well again.

I should have produced a "tech note" about this sooner because it might have helped in your particular case. In the next s/w rev, we'll make this automatic (making a backup of the super.dat file).
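A minimal sketch of that backup step in Python, in case it helps anyone script it. Only the config/super.dat path and the super.bak name come from the post above; the flash mount point (/boot), the helper names, and the idea of restoring with the array stopped are assumptions, so adjust them for your own system.

```python
import shutil
from pathlib import Path

# Assumption: the flash drive is mounted at /boot; change this if yours differs.
FLASH_ROOT = Path("/boot")


def backup_super(flash_root=FLASH_ROOT, backup_name="super.bak"):
    """Copy config/super.dat to config/<backup_name> and return the backup path."""
    src = Path(flash_root) / "config" / "super.dat"
    dst = src.with_name(backup_name)
    shutil.copyfile(src, dst)   # plain content copy; raises if super.dat is missing
    return dst


def restore_super(flash_root=FLASH_ROOT, backup_name="super.bak"):
    """Copy the saved file back over config/super.dat.

    Assumption: this is done with the array stopped, before it is started again.
    """
    dst = Path(flash_root) / "config" / "super.dat"
    src = dst.with_name(backup_name)
    shutil.copyfile(src, dst)
    return dst
```

Calling `backup_super()` before an upgrade/expand and `restore_super()` if things go south mirrors the manual copy described above; it does nothing to the data disks themselves.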
October 3, 2006 Author

Ya, to explain the whole sequence of events would've taken all night and I was pretty tired. Not sure what went wrong on the disk I wasn't upgrading, but having known-good disks everywhere else and Parity that I'm pretty sure wasn't written to should've meant it was recoverable. However it wasn't, because the software did what it thought was right and there was no way that I was aware of to override it, much less know what it was proposing to do until it was too late. Hindsight is 20/20, and anything that can be done to prevent this from occurring to someone else would be a welcome lesson. In the end the data I lost is replaceable with some great effort on my part, so that's something at least.

Last night after my post above reiserfsck gave up on the second drive, no data restored. I now have to switch back the two drives we switched around while troubleshooting and will probably put in new drives in place of both of the wiped drives. Not sure what state the array will be in at that point with so many new unformatted drives, but I guess we'll see. I hope to not lose any more data though! I suspect I'll have to reset the config; I'll look at it when I get home.

I think it's a pretty well known fact that if a double failure is experienced we'll lose data with this system. That's an acceptable risk to me, since on many other systems a complete loss of data would have been experienced. It's just frustrating that this error's cause is a mystery and that recovering the data should have been possible but wasn't because I couldn't override the array. It wrote all of 10K to the old "good" drive and that was enough to completely blow it away - had I known what it was going to do beforehand I could've prevented that loss of data and still had a shot at rebuilding the other drive, if I could have directed it to do so. Anything that can be added to allow for that in the future would be appreciated....

Super.dat - is that file something that can be (or could've been) manually modified? Is it a binary format or textual? If textual, then why was I swapping disks to get the array into a state where it would try to rebuild a disk (the wrong one at that) when I could've simply modified that file to tell it to rebuild Disk8 and assume Disk11 was good (which it was)? If I'd had some manual control over that file and Parity was indeed good, then it sounds like I'd be doing the Snoopy dance now rather than facing the job of digging through boxes to find a ton of DVDs to re-rip, amongst other losses.
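To make the single-versus-double failure point concrete, here is a toy Python sketch. It assumes the Parity disk holds the byte-wise XOR of all the data disks, which is the usual single-parity scheme but is not spelled out anywhere in this thread. The disk contents and function names are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(disks, parity, missing):
    """Rebuild the one disk at index `missing` from parity plus all the other disks."""
    survivors = [d for i, d in enumerate(disks) if i != missing]
    return xor_blocks(survivors + [parity])


if __name__ == "__main__":
    disks = [b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]   # toy contents of three data disks
    parity = xor_blocks(disks)

    # One disk lost: parity plus the two good disks give it back exactly.
    print(rebuild(disks, parity, missing=1) == disks[1])

    # A second disk overwritten before the rebuild: the result is no longer the lost data.
    damaged = [disks[0], disks[1], b"\x00\x00"]
    print(rebuild(damaged, parity, missing=1) == disks[1])
```

Under that assumption, rebuilding Disk8 needs every other data disk plus Parity to be intact. Once a second disk has been overwritten there are two unknowns and only one parity equation, which would explain why neither disk came back.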
October 3, 2006 Author

Hrm, I've replaced the two disks that got hosed, started the array, halted the rebuild it wanted to do, and now tried to format the "unformatted" disks. It starts and then stops the format. It claims that there's an invalid data disk in there and wants to rebuild. Since it cannot possibly bring that disk back, I'm not sure how to fix this. If I use "restore" it says it will leave the data alone on the disks, but will it still recognize that two disks are unformatted and need to be formatted?

Okay, did the reset thing and formatted. It's doing a Parity pass now. Thankfully some of my torrents are still around so I'm able to reclaim that at least. Whew!