July 9, 2008

Hey guys, I upgraded from 4.2.1 to 4.3.2 and followed the install notes in the release. After reading here, I found out there were some big issues with drives being renamed (WD in particular), which the install notes I read did not mention. I went through the upgrade thread and read the FAQs for a solution. After upgrading, I remember three disks being listed in red as out of order. I think I have them in the right positions (I can't confirm, though). I have tried the Reset button (after trying the Start button) and the tutorial from the other thread by Tom on how to do a manual reset. The issue now is that the reset doesn't work, and when I hit the "Start" button the disks turn green but the status says "Stopped. Configuration valid." I get no option for a parity calculation or anything. It's like half of it thinks the array is started while the other half doesn't. Any ideas what I'm missing? Thanks
July 9, 2008

Basically, the partition that the unRAID server is expecting to exist on one of your data drives does not exist. This would suggest that you do not have the devices assigned to the proper slots. The parity drive will not have a file-system; all the others should.

Joe L.
July 9, 2008, Author

Is there any way to check what is supposed to go where without having a screenshot from before the upgrade? Maybe some Linux mount command to see what is on them? Thanks for your quick response
July 9, 2008

Yes, you can attempt to mount them via the command line, or you can do almost the same thing from the unRAID interface, since one of the first things it does is mount the file-systems. Any it cannot mount show as "unformatted". Odds are good that a single unformatted drive was your parity drive.

READ THESE ENTIRE INSTRUCTIONS SEVERAL TIMES BEFORE YOU DO ANYTHING MORE.

To proceed, go to the Devices page and un-assign the parity drive. Assign it instead as a data drive. Now you will not have ANY parity drive assigned, and all the drives will be assigned as data drives.

If you were to press "restore" at this point (with NO extra commands typed to skip parity calcs), the array should attempt to mount all of your drives and present you with the option of starting your array. One of your drives (hopefully only one) will show as "unformatted" with a checkbox to format it. DO NOT CHECK THE BOX, DO NOT START THE ARRAY, DO NOT PASS GO, DO NOT COLLECT $100. (Sorry, I got carried away.)

If you see more than one drive as "unformatted", that is bad. It indicates you assigned a data drive to the parity slot and wrote to it enough to corrupt the file system that used to be on it.

If you see only one drive as "unformatted", and it is one of your largest disks, you have probably identified the parity disk. (Remember, parity has to be as large as or larger than any other disk.)

To confirm, go back to the Devices page, un-assign the parity drive (the one disk that showed as unformatted), go back to the main page, and press "restore" once more. At that point, all the drives should be blue. If so, you can start the array. With no parity drive assigned it should come up online and you can verify all your data is still there. If any device should show as unformatted, DO NOT start the array, DO NOT re-format the drive, etc.

If all your data looks good, then stop the array, go back to the Devices page and re-assign the parity drive as parity. Then go back to the main page and start the array once more. It should come up and start the process of re-calculating parity. If it does, you can either let it complete, or stop it, stop the array once more, press "restore", use the set invalid slot command, and then start the array one last time. Let the parity check complete. Odds are it will not find anything, but you will want to be certain.

Joe L.
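The rule of thumb in the instructions above (exactly one unformatted drive, and it is among the largest) could be written down as a small shell sketch. This is only an illustration: `classify_drives` and its input format (`name size_gb status`, typed in by hand from what the main page shows) are hypothetical, and nothing here reads from or writes to unRAID itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of unRAID. Reads one drive per line on stdin:
#   <name> <size_gb> <status>
# where status is "ok" or "unformatted", copied by hand from the main page.
classify_drives() {
    awk '
        # Track the largest size seen across all drives.
        { if ($2 + 0 > max) max = $2 + 0 }
        # Count unformatted drives and remember the last one.
        $3 == "unformatted" { n++; name = $1; size = $2 + 0 }
        END {
            if (n == 0) {
                print "no unformatted drive: parity not identified this way"
            } else if (n > 1) {
                print "STOP: " n " drives unformatted"
            } else if (size == max) {
                print "likely parity: " name
            } else {
                print "STOP: " name " is unformatted but not one of the largest disks"
            }
        }'
}
```

For example, `printf '%s\n' 'disk1 200 ok' 'disk2 500 unformatted' 'disk3 500 ok' | classify_drives` prints `likely parity: disk2`. Any "STOP" result means the same thing as in the post: do not format anything and do not start the array.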
July 9, 2008

Just a quick message to warn you NOT to do anything drastic yet, until we have more time to analyze. Your syslog indicates 2 drives with names too long, but also a drive creating 'exception Emask' errors of 'ATA bus error', which is keeping the drive from being read, which makes it look unformatted. This drive is NOT the parity drive, as it is only 200GB.

This is just a quick 'hold up' message while I/we look at it some more and write up additional thoughts. One thing I can recommend is replacing the drive cable to Maxtor_6L200R0_L50S35XH, currently considered Disk 1 (sdh).

More in a bit...
July 9, 2008, Author

Before reading your message, Rob, I went ahead and un-assigned the parity drive, assigned it as a data drive, and hit restore (not start). It gave me the option to start the array after that, but didn't list any drive as being unformatted. I have since set the drive back as parity and am awaiting further instruction. I replaced the cable to the 200GB Maxtor to be safe.
July 9, 2008

A new copy of your syslog would not hurt... Rob will be better able to analyze. Interesting that unRAID did not show you any "unformatted" drives. I did not expect that. Good that you stopped for further advice.

Joe L.
July 9, 2008

You appear to have had the great misfortune of upgrading at the same time a hardware issue occurred, terrible timing unfortunately. Is there any chance you have a recent syslog from before the upgrade?

I spoke too quickly about the position of the 'bad' Maxtor. I specified the right drive, Maxtor_6L200R0_L50S35XH, but it is currently considered Disk 2 (sdh), not Disk 1.

You have an nForce 3 board, and the drive problem involves a Maxtor 6L200R0, a combination that is also troubling to me. I began my unRAID experience with an nForce 4 board, and had several drives I could not get to work consistently, the worst of which was a Maxtor 6L300R0, which I finally decided was not compatible with the early nForce series boards. Since your board seems to have been working until now, you will want to continue with it, but I would strongly urge you to consider upgrading to something newer or different, at least no nForce boards older than the nForce 5 series.

It is possible that if you power down and remove power completely from your unRAID array (for 10 seconds), it will power up correctly, that is, this Maxtor will begin to respond to drive commands again. In my case, that was true with my similar Maxtor. If I forgot and allowed it to spin down, then only a period of complete power isolation would reset it sufficiently to begin talking again. For me, starting the array from complete power off produced better stability and fewer errors than trying to reboot the array, at least with my nForce 4 board. In your case, the 'ATA bus errors' and possibly the 'ICRC ABRT' messages seem to point to either bad communications or a drive not responding correctly, which could be either a bad cable or a drive that needs a complete reset, total power off.

I don't see any connection between this 'bad' drive and the upgrade to 4.3.2. (The other 2 drives with long model names ARE an upgrade-related issue.) The fact that this occurred during the upgrade just appears to be bad timing. A parity check at this time could have brought the problem to light, and I imagine Brian may want to chime in here about the need for periodic parity checks, justifiably here, although in this case just trying to restart the array (without the upgrade) would probably have exposed the problem.

The 2 drives with long names are the WDC_WD5000AAKS-00YGA0_WD-WCAS80936177, currently Disk 7 (sdd), and Hitachi_HDT725050VLA360_VFK401R41ZBLSK, currently Disk 10 (sde). Elsewhere in the discussion of the Western Digitals, there was a brief mention of a Hitachi, but it did not look like others picked up on that. This is the second Hitachi with the same too-long-model-name issue.

At this time, I would recommend restoring v4.2.1 to your flash drive, because we really don't want to have to deal with the upgrade problems until your array is fully valid again, including parity. I really hope we haven't lost your parity drive, as it is possible you may need it to rebuild Disk 2, the bad Maxtor. Hopefully, a complete power off and/or the replacement cable will bring it back to life.

Just saw the 2 new messages. The fact that you saw no 'unformatted' drives is very good news! A new syslog is imperative before we do anything else. I still would like to see an older syslog, from before the upgrade, to confirm the correct drive positions and check for issues with that Maxtor.
July 9, 2008

The new syslog confirms the Maxtor is working again. The array was not started, so unRAID has not yet examined the drives for valid Reiser file systems. But if you/we can confirm that the drive positions are correct, particularly that the parity drive is assigned correctly, then you don't need to revert to v4.2.1, but can proceed with Tom's modified upgrade procedure (and let the parity check proceed all the way through). We just need an older syslog to confirm the drive positions, unless you can find notes of all the drive assignments. The safest procedure of all would be if you had a backup of your flash drive from before the upgrade, and could restore it and do a full parity check, then proceed to upgrade.

One thing that puzzles me: you have 2 Maxtor 6L200R0's, with very similar serial numbers, L50S35XH and L50S2X9H. One is connected to an IDE port and identified as hda, which is as expected. The other (the problem one) has been identified as a SCSI device, as sdh, like a SATA drive. Is it possible that this drive is connected to an IDE to SATA adapter? It does seem to work, but if you have more problems with this drive, then the adapter should be considered suspect.
July 9, 2008, Author

Wow, thanks for the responses. It is an older nForce 3 board, but valuable because of the six IDE ports (2 of which are connected to a 3rd-party on-board bridge, so that is probably why it shows up as that SCSI device). I'll definitely be upgrading in the next 6 months though.

Do you guys have an old copy of 4.2.1 you could throw up temporarily? I found 4.2.4 from the download archive, but I overwrote my copy I'm afraid. About the syslog, I think I overwrote that as well. I'll check the flash drive when I get home this evening to see if I backed them up, though it's not looking good. I have been running monthly parity checks, so at least that is a plus. I'm a believer in backing up syslogs now.

I remember when first upgrading it was only 3 drives that turned red and moved around. If I have no idea what the original order of the drives was (I've also got to make sure I didn't physically label the drives), does that leave us with just regenerating the parity drive with the drives at their new positions?

Also, to verify that I have the correct parity drive in the right slot, I was going to do a read-only mount. Would the command look something like "mount -ro reiserfs /dev/sdx /testmount" after creating the test mount directory?

Thanks again for all your help, it is much appreciated!
July 9, 2008

I believe (subject to correction) that what you want is:

    mkdir /mnt/testmount
    mount -r -t reiserfs /dev/sdx1 /mnt/testmount -o umask=111,dmask=000 -v

I would refer you to the HowTo at http://lime-technology.com/wiki/index.php?title=Copy_files_from_a_NTFS_drive, except you don't need the modprobe, and you aren't copying, just testing, so you may not need the -o parameters. Don't forget to umount when you are done with each.

I do have a v4.2.1 if you still need it.

The 3 red balls you saw initially were the 2 drives with long names and the bad drive. Since I think you have pressed the Restore button, the new longer drive names are stored, and that Maxtor is responding correctly, so no red balls. Once you are happy with the drive positions, she is ready to restart and build a new parity drive, or use Tom's special commands, and it will just start a parity check.
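A rough sketch of how the test mount above might be wrapped so it can be rehearsed before touching any drive. The `run` and `probe` helpers and the `DRY_RUN` switch are made up for this illustration. By default the commands are only printed, and the `-o` options are left out on the assumption, stated in the post itself, that they may not be needed for a simple test.

```shell
#!/bin/sh
# Hypothetical helpers, not part of unRAID.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# probe <partition> [mountpoint]
# Read-only test mount of one partition, list its top level, then unmount.
probe() {
    part=$1
    mnt=${2:-/mnt/testmount}
    run mkdir -p "$mnt" || return 1
    if run mount -r -t reiserfs "/dev/$part" "$mnt"; then
        run ls "$mnt"
        run umount "$mnt"
    else
        # A failed mount only says no readable reiserfs was found there.
        echo "$part: did not mount as reiserfs"
        return 1
    fi
}
```

Calling `probe sdx1` with the default `DRY_RUN=1` only echoes the `mkdir`, `mount`, `ls` and `umount` lines; setting `DRY_RUN=0` would actually run them. `sdx1` is the same placeholder used in the post, not a real device name. A drive that fails to mount is a parity candidate, but as noted earlier in the thread, a drive with bus errors can look the same.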
July 9, 2008, Author

I'm currently rebuilding on 4.2.1; then I will back up everything and try the upgrade again. Then, if I understand correctly, I will use this method to combat the long names:

Here's the workaround. Do this only if you know the configuration was completely valid before you upgraded, i.e., no disabled or missing disks, parity is valid.

1. Boot server normally & observe array not started due to hard drive model being off by at most 4 characters (same s/n though).

2. Click Restore (after first checking the "I'm sure I want to do this" box). This will result in all disk status symbols being 'blue'. The server state should be "Stopped. Initial Configuration".

3. From the console or telnet window, type this command:

    mdcmd set invalidslot 99

   The output of this command should be this:

    cmdOper=set cmdResult=ok

4. Now click Start. All the disk status indicators should turn Green; the system state should be Started; and there should be a parity-check in progress. You can let the parity-check complete or cancel it.
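Step 3 depends on seeing the expected result before clicking Start. A minimal sketch of that check, assuming only that the output contains the text `cmdResult=ok` as quoted above; `mdcmd_ok` is a hypothetical name, and the function merely inspects whatever text is piped into it.

```shell
#!/bin/sh
# Hypothetical check, not part of unRAID: succeeds only if the text on
# stdin contains the result quoted in step 3 of the workaround.
mdcmd_ok() {
    grep -q 'cmdResult=ok'
}

# Intended use on the server console (not executed here):
#   mdcmd set invalidslot 99 | mdcmd_ok && echo "OK to click Start"
```

If the check fails, the safe reading of the workaround is to stop and ask before clicking Start.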