May 18, 2008 Hey All, this is how I got into trouble the last time with a failed drive. I have a TV series of 5 seasons with 25-30 episodes each. I usually have no problem making the directory on the server and then the season folders. Then I copy the files over per season and rename them to what I need them to be so I can scan them later. I did that this morning with season 1, got to about the 8th episode, and then it came up with "network connection no longer available". So I rebooted the workstations and then tried to stop the array - hang. Tried a shutdown -r from the terminal - hang. A hard reboot was the only option, and now a parity check is in progress. The array is present in Windows and all drives are there. I can access the GUI and it is performing its parity check, although it took a long time to mount my two new 750GB drives, about 2 minutes per drive, before it would start the parity check. Now if I go to the network in Windows and try to access the folder for that show on that drive, Windows Explorer hangs and crashes (not responding). I cannot access the folder on the unRAID server to do anything with those files for that TV show, and obviously cannot access the folder to delete it either. I can, however, access the file on the workstation where it resides, and I can play it back from this machine across the network in Media Player and it plays just fine. This is why I assume it is an issue with the server. P5PE-VM, Celeron 802D, 512MB RAM, 2 SATA 300TX4 cards and the Kingston 512 flash, unRAID 4.2.2 Pro, 10 drives. Any thoughts? Thanks, Dave
May 18, 2008 Author Update - I have tried to transfer the same files, all 5 seasons, from one workstation to another, and it transferred with no problem in half the time it took trying to transfer one season to the server. All machines are on the same workgroup/network. I forgot to mention that the PSU in the server is a Silencer 610. Regards, Dave
May 18, 2008 Are you transferring the files to "user-shares" or to "disk shares"? Are you transferring the files all to the top-level folder, or to sub-folders? What exact version of unRAID are you using? The reason I am asking all this is that it sounds as if you are using up all the RAM in your server in trying to create the new files and/or folders in memory. Once you run out of memory, all bets are off. I would strongly suggest upgrading to version 4.3-b6 (or 4.3 final if it is available by the time you upgrade), as the entire user share system was re-written to be more memory efficient. I'd also suggest you transfer your files to disk shares and see if the same problem occurs. Those would be //tower/disk1, //tower/disk2, etc. Those would not have any issue with memory allocation for shares. The older user share system tended to lock up if too many top-level folders and/or files were created. Joe L.
May 18, 2008 Author Hey Joe, I am using 4.2.2. I am transferring to //tower/disk4/tv_name_folder/Season1 etc., and yes, I normally create the top-level folder as the TV show name and then create the sub-folders, i.e. season1, 2, 3, etc. Then I copy the files over per season and rename them to "show name s01e01 - show name.avi". I have only had this happen one other time. I did try to transfer all 5 seasons over the first time to the //tower/disk4/show name directory and it hung Windows Explorer. Regards, Dave
May 18, 2008 Author Joe, I think I will slap another GB of RAM into the server; that may help. I am wondering, though, about upgrading to 4.3. It is in the middle of a parity check. Is it safe to shut down and install the new RAM? What files do I have to copy to the flash for 4.3, or what is the most painless process? I know I did this before from 3.1 to 4.2.2. I think I just copied bzroot and the config file, but I am not sure. Thanks, Dave
May 18, 2008 Author Another thought: I have an ASUS M2NPV-VM sitting here doing nothing, with an AMD 4200X2 and 2GB of PC2-6400 that I have been trying to sell off. It has 4 SATA ports on board and the Marvell Gb Ethernet controller. What should I do if I want to install this board and replace the ASUS P5PE-VM? Do I just note the hard drives' positions and serial numbers, install the board, boot, set up the drives and let it parity check? Is unRAID stable with AMD processors at this point? Regards, Dave
May 18, 2008 All you should need to copy are the bzroot and bzimage from the newest 4.3-b6 release to upgrade. You will probably want to wait until the parity check is complete if you had to force a re-boot when it locked up on you. Once it is finished and back to normal, before you transfer files again, do the upgrade. Before you shut down the current array, make note of the disk serial/model numbers, since sometimes different versions of Linux scan and name the drives in a different order. When you start back up, you can check the devices page to put them back in the same order if need be. When you start the array back up, if parity was good before you shut down, you should not need to re-check it; it will still be good, even if you swap out the motherboard. I don't know of any issue with AMD processors... you should be fine. Joe L.
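The copy step Joe describes could be sketched in shell roughly as follows. This is only an illustration under assumptions: the function name `copy_unraid_kernel`, the `.old` backup suffix, and both directory arguments are hypothetical. The post itself only says that bzroot and bzimage from the 4.3-b6 release need to be copied to the flash. The sketch keeps a copy of the previous files so a bad upgrade can be rolled back.

```shell
#!/bin/sh
# Sketch: replace bzroot and bzimage on the flash drive, keeping backups.
# Usage: copy_unraid_kernel <dir-with-new-release-files> <flash-dir>
# Both arguments are placeholders; point them at your own paths.
copy_unraid_kernel() {
    src=$1
    flash=$2
    for f in bzroot bzimage; do
        # Refuse to touch the flash if the new release is missing a file.
        if [ ! -f "$src/$f" ]; then
            echo "missing $src/$f" >&2
            return 1
        fi
    done
    for f in bzroot bzimage; do
        # Keep the previous version next to the new one (suffix is arbitrary).
        if [ -f "$flash/$f" ]; then
            cp "$flash/$f" "$flash/$f.old" || return 1
        fi
        cp "$src/$f" "$flash/$f" || return 1
    done
}
```

Both files are checked before anything is overwritten, so a partial download does not leave the flash with a mismatched bzroot and bzimage.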
May 18, 2008 I had the exact same problem when I was running 4.2.1. I could transfer a few gig and then the unRAID box would die. Totally lock up. This would only happen if I mapped the share to a network drive. If I transferred directly to the server I never had the problem. I upgraded to the version in my sig and have no problems now. I have a hard drive that doesn't like to wake up, but that is not unRAID's fault. Phil
May 18, 2008 Author Thanks for the notes, guys. I will attempt that as soon as the parity check is finished. Regards, Dave
May 19, 2008 Author Well, I did all we discussed. Installed the AMD board and the 2GB of RAM. Installed 3 new fans behind the hard drive cages; all drive temps are now 28-35 degrees. Booted up last night about 11PM. It started a parity check again after I reassigned the drives. I let it run until this morning and all drives were green. On a fresh boot I had access to the folder that was causing problems, and I deleted the file that was an issue as well. Then I tried to copy one season over and did not even get a copy window started before it gave me a "network connection no longer available" error, and again it hung or maxed out the network. I know if I try to stop the server now it will want to do a parity check again when I reboot. I am attaching a txt file with the log from last night when I booted until now. Thanks in advance. I have had this issue only a couple of times, where Windows crashes Explorer on AVI files in abundance. I did some research and found an on and off bat file that enables and disables a key in the registry, and that usually works as well. I set up the old board on a test bed with a CD and I am running the Ubuntu 7.10 live CD to try to at least access that problem folder on the server. I can browse to the folder and open it, then Ubuntu says it cannot display the files. Is there a command to release the network buffer, or whatever memory is maxed, without rebooting? Regards, Dave
May 19, 2008 I believe you have some corruption in the Reiser file system that may be causing the problems. You had a system crash that involved the reiserfs subsystem, and that crash is probably what you have been seeing when you would lose access to your unRAID server. I note that you had v3.1 at one time, and this kind of reiserfs crash looks similar to the problems a few users have had, with a somewhat rare incompatibility between the Reiser file systems built in unRAID v3 and the Reiser file systems built with v4 and up. I suggest that you run reiserfsck on ALL of your data drives (not the parity drive), and if any pass instructs you to rerun with a special parameter, such as --rebuild-tree or --fix-fixable, do so, exactly as instructed. If it were my system, I would unassign the parity drive until all of the data drives are working correctly and have passed all reiserfsck tests, then afterward re-assign the parity drive and rebuild parity. But that is your choice. It will make a difference as to how you run reiserfsck. If you run with parity, then run reiserfsck on md1 to md9; see this thread: http://lime-technology.com/forum/index.php?topic=463, and this Wiki article: http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems. If you run these tests without parity, then stop the array, get the 9 device IDs from the Devices tab, and add a digit one (e.g. /dev/sda1) for each of the data drives (e.g. reiserfsck /dev/sda1). When attaching a syslog, I recommend capturing the entire file, according to the instructions on the page in my sig. It has a lot of additional and helpful info, without all of the word wrap.
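The two ways of naming the drives for reiserfsck can be sketched as a small shell helper. It only prints the commands so they can be reviewed before anything is run. The function names `md_check_cmds` and `raw_check_cmds` are made up for illustration, and the use of `--check` (the read-only mode) is an assumption; the post simply says to run reiserfsck and follow whatever it instructs.

```shell
#!/bin/sh
# Sketch: print (not run) one read-only reiserfsck command per data drive.

# With parity assigned: /dev/md1 .. /dev/mdN, the array devices.
md_check_cmds() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        echo "reiserfsck --check /dev/md$i"
        i=$((i + 1))
    done
}

# Without parity: append partition number 1 to each device ID
# taken from the Devices tab (sda -> /dev/sda1).
raw_check_cmds() {
    for dev in "$@"; do
        echo "reiserfsck --check /dev/${dev}1"
    done
}

# Example: md_check_cmds 9
#          raw_check_cmds sda sdb sdc
```

Each command is still run one drive at a time, and reiserfsck normally asks for confirmation before it starts. The linked Wiki article covers the preparation steps that come before the check.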
May 20, 2008 Author Hey RobJ, thanks, chum, for your quite in-depth reply. Nice touch! Anyhow, I chose the easier route and ran reiserfsck on the drives as per the post. It only found one error, on md4, and I used the --fix-fixable option and it fixed it right up. I was a little concerned about doing it without the parity drive; I was not certain whether, by adding a digit, the numbering grew, i.e. /dev/sda1, /dev/sdb1 or sdb2, etc. Is there an article on doing it with the array not running and with the parity drive un-assigned? Now that I am typing this, I see! I think there is only one sda and only one sdb and so on, so I guess it would have been /dev/sda1, /dev/sdb1, /dev/sdc1, etc. I should have typed this to myself last night. I had the system up this morning and chose to install Total Commander to transfer the files. I transferred over 1 season of 30 175MB episodes and it went OK, so I transferred the other 4 seasons at the same time and all went well. Just to test, when the copy was finished I stopped and started the array a couple of times to make sure all was OK; it would normally hang there if the problem existed. Can you run reiserfsck on the whole system, or is it one drive at a time like I did, i.e. reiserfsck /dev/md1? Is there a command line such as reiserfsck /dev/md1 /dev/md2 etc., or the like, that would run on the whole file system and report any issues, so you could then run the repair on whichever drive was at fault? Thanks again, Dave
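One way to get a per-drive report like the one asked about here is to loop over the devices and translate each exit status into a next step. The sketch below is an assumption-laden illustration, not a tested unRAID procedure: the function names and the `CHECKER` variable are hypothetical, and the exit status meanings (0 clean, 1 corrected, 4 needs --rebuild-tree, 6 needs --fix-fixable) are taken from the reiserfsck(8) man page and should be verified against the version installed on the server.

```shell
#!/bin/sh
# Sketch: turn a reiserfsck exit status into a suggested next step.
# Status meanings follow the reiserfsck(8) man page; verify locally.
reiserfsck_advice() {
    case $1 in
        0) echo "clean" ;;
        1) echo "errors were corrected" ;;
        4) echo "rerun with --rebuild-tree" ;;
        6) echo "rerun with --fix-fixable" ;;
        *) echo "see reiserfsck output (status $1)" ;;
    esac
}

# Check each device in turn, then print one summary line per device.
# CHECKER (hypothetical) is the command to run; default is a read-only check.
check_all() {
    summary=""
    for dev in "$@"; do
        # Output stays visible; reiserfsck may ask for confirmation.
        ${CHECKER:-reiserfsck --check} "$dev"
        rc=$?
        summary="$summary$dev: $(reiserfsck_advice "$rc")
"
    done
    printf '%s' "$summary"
}

# Example: check_all /dev/md1 /dev/md2 /dev/md3
```

The drives are still checked one at a time; the loop only saves retyping and gathers the results in one place. Any repair is left as a separate, deliberate step on the drive that reported a problem.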
May 20, 2008 Make sure you remember to use the "Check" button to re-calculate parity, especially if you used reiserfsck on the /dev/sda1, etc. drives. If you did, then the parity is now out of sync for sure, since you changed a data drive while the parity drive logic was not looking. Glad you found the basic cause of your transfer problems. Expect the parity check to find mismatches the first time it is run. Expect to see no errors when you run it a second time. Joe L.
May 20, 2008 Author Hi Joe L, I did not run reiserfsck with the parity drive un-assigned; I actually did it as per the post that RobJ mentioned above. I did not stop the array, but stopped samba and then umounted /dev/md1 through /dev/md9, performed the check on each data drive, and then restarted samba as per the post. I thought that parity stayed in sync if it was done that way, with no need for another parity check. Is this not correct? So should I stop the array and perform another parity check now? And what about the data I installed this morning: will it be included in the parity check, or will it be out of sync on md4 as well? This is how I ended up losing the last drive and having to rebuild the data from scratch. Regards, Dave
May 20, 2008
Dave wrote: "Hi Joe L, I did not run reiserfsck with the parity drive un-assigned; I actually did it as per the post that RobJ mentioned above. I did not stop the array, but stopped samba and then umounted /dev/md1 through /dev/md9, performed the check on each data drive, and then restarted samba as per the post. I thought that parity stayed in sync if it was done that way, with no need for another parity check. Is this not correct?"
You are correct; running reiserfsck on the "md" devices keeps parity updated as the repairs to the file system are made. There is no need for an additional parity check.
Dave wrote: "So should I stop the array and perform another parity check now? And what about the data I installed this morning: will it be included in the parity check, or will it be out of sync on md4 as well?"
Leave your array running. No need to stop it. The data you added this morning is fine. If you want to be absolutely certain, you can press the "Check" button on the management web page while the array is running. It will take some number of hours to complete, but it will check and correct any possible problems. You should not see any errors, and if you do, you should not see any errors when a parity check is run a second time. Joe L.
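Why a repair through the "md" devices keeps parity valid while a repair on the raw /dev/sdX1 partitions does not can be shown with a toy model. This is a sketch that assumes simple XOR parity with one byte standing in for each drive; the variable names are made up, and the real driver works on disk sectors rather than single numbers.

```shell
#!/bin/sh
# Toy model: one byte per drive, parity is the XOR of all data bytes.
parity_of() {
    p=0
    for b in "$@"; do
        p=$((p ^ b))
    done
    echo "$p"
}

d1=12 d2=200 d3=7
parity=$(parity_of "$d1" "$d2" "$d3")

# Write through the array device: parity is patched with old XOR new.
new_d1=99
parity_md=$((parity ^ d1 ^ new_d1))

# Write straight to the raw partition: the data changes, parity does not.
parity_raw=$parity

expected=$(parity_of "$new_d1" "$d2" "$d3")
[ "$parity_md" -eq "$expected" ] && echo "md write: parity in sync"
[ "$parity_raw" -ne "$expected" ] && echo "raw write: parity stale, run Check"
```

In the model, the write that goes through the array ends with parity matching the new data. The direct write leaves the old parity value behind, which is the situation where a "Check" would report and correct mismatches.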
May 20, 2008 Author WHEW!!! Thanks, Joe L. I thought I was in for another long night. Regards, Dave