Drabert Posted January 22, 2013 I replaced a failed drive in my system with one that I did not preclear or format. I have been fighting parity errors for the past week, and every night when my cache drive tries to write to the array the system basically craps out. Just when I thought I had figured out which drive was causing the issues, another drive came up as failed. I disabled the newly failed drive on the array, and now the one that I replaced shows up as an unformatted disk and asks me if I want to format it. I am currently preclearing a drive to replace the one that just came up with a red ball. So my question is: in what order should I do things to give myself the best chance of keeping my data? Do I swap the bad drive with the one I am preclearing first, then preclear the one that is asking to be formatted and put it back in the array? Or do I leave the dead one down and use this precleared drive for the one that is asking to be formatted? Sorry if my question is a bit jumbled, but my head is pretty jumbled right now too. unRAID version 5.0 beta 13. Thanks, Drabert
Joe L. Posted January 22, 2013 Our heads are jumbled too. Post a screenshot AND a syslog. You are one mistake away from losing data. (I hope you have backups of anything important, because if you do not, now is a good time to make them BEFORE you do anything that involves moving disks around.) A replacement disk NEVER has to be formatted, so that alone is an issue. DO NOT FORMAT ANY DISK!!!!! Not unless it is an ADDITIONAL drive being added to a working existing array. (In other words, as an example, you had 7 drives and are now adding an 8th to a working array with no problems.) DO NOT SET A NEW DISK CONFIGURATION EITHER... That will immediately invalidate parity. (And right now, I think that is about the only thing keeping you from losing some data.)
Drabert Posted January 22, 2013 My current syslog file is 602 MB and will only compress down to 80 MB. How many lines should I cut it down to so I can upload it? Thank you for getting back to me so quickly. Drabert
dgaschk Posted January 22, 2013 Put the zipped version on Google Drive or Dropbox.
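For a syslog that is too large to upload even when zipped, one option is to keep only the most recent lines, since those usually cover the latest failure. The following is a minimal sketch, not an unRAID tool: the function names and the default of 5000 lines are assumptions chosen for illustration, and the right number of lines depends on how far back the errors start.

```python
from collections import deque


def tail_lines(path, count=5000):
    """Return the last `count` lines of a text file.

    The file is streamed line by line, so a multi-hundred-megabyte
    syslog never has to fit in memory at once.
    """
    with open(path, "r", errors="replace") as handle:
        # A bounded deque silently discards the oldest line once it is full.
        return list(deque(handle, maxlen=count))


def write_tail(source, destination, count=5000):
    """Copy the last `count` lines of `source` into `destination`.

    Returns the number of lines actually written.
    """
    lines = tail_lines(source, count)
    with open(destination, "w") as handle:
        handle.writelines(lines)
    return len(lines)
```

Calling `write_tail("/var/log/syslog", "syslog-tail.txt")` would produce a much smaller file to zip and attach; both paths here are placeholders.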
Drabert Posted January 22, 2013 Dropbox downloaded, and here is the link: https://www.dropbox.com/s/ojh9gne4ek3ja4f/syslog-20130121-195203.zip Thank you!
Drabert Posted January 22, 2013 I am currently on the last step of the formatting, and it looks like it will finish sometime later tonight. Should I stop my array and use the new precleared disk for Disk 9 on my system, since it is coming up missing? After that I can start working on why the other disk is coming up as "not formatted". Drabert
Joe L. Posted January 23, 2013 The preclear script DOES NOT format the disk. It writes zeroes to it and then puts a small signature in the MBR so that unRAID recognizes that the disk has been zeroed. Your approach is as good as any. Just do not format the disk that is coming up as "not formatted" unless you no longer want any of the data that was stored on it. Joe L.
Drabert Posted January 23, 2013 Ok, preclear finished and I have replaced the bad drive with the precleared one, and it is now rebuilding. Once this drive has finished rebuilding, what should my next course of action be? Perform a parity check on the system? Fail the drive that was showing as unformatted and have the system rebuild it onto a fresh drive? In the two years I have been using unRAID, this is the first time I have had any issues like this. I have been fighting with a bad chassis for the longest time, but now the WAF is very low since our entire library does not seem to want to stay available. I am not sure if it is the mover script trying to write to the bad drive that causes everything to get horked, but at this point I would rather lose a TB of data (it's all TV/movies) than have bad WAF. Added current screenshot. Drabert
Drabert Posted January 23, 2013 Ok, my rebuild finished, but now I'm getting write errors on the drive that was showing that it needed to be formatted and on my parity drive. I am also not able to access the share that has my media files on it; it shows all the folders as empty. Is it safe to replace the drive that looks like it is failing, in the hope that the parity drive will rebuild it? New syslog attached. Thanks for the help, guys. Drabert syslog-20130122-201523.zip
Drabert Posted January 23, 2013 The system locked up again, so I forced a reboot, and now another drive has failed. I started to rebuild it onto yet another drive, but I'm not sure when this is going to stop. The drive I am replacing had not been written to yet, so I was not worried about letting it get rebuilt. After this drive has finished rebuilding, should I stop the array and reboot before the system locks up again? It shows the errors are from disk md4, but I am not sure which one that is in my array. Drabert
Joe L. Posted January 23, 2013 md4 = disk4
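The "md4 = disk4" rule from the reply above can be applied mechanically when scanning a long syslog. This is a rough sketch under stated assumptions: it relies only on the mdN-to-diskN naming given in the thread, the function names are hypothetical, and the sample log lines in the usage note are invented for illustration rather than copied from a real unRAID syslog.

```python
import re
from collections import Counter

# Matches device names such as "md4" or "md11" as whole words.
MD_DEVICE = re.compile(r"\bmd(\d+)\b")


def md_to_disk(text):
    """Return the array slot name (e.g. 'disk4') for each mdN device in text."""
    return ["disk%d" % int(number) for number in MD_DEVICE.findall(text)]


def count_disk_mentions(lines):
    """Count how often each array slot is mentioned across log lines."""
    counts = Counter()
    for line in lines:
        counts.update(md_to_disk(line))
    return counts
```

For example, `count_disk_mentions(["write error on md4", "read error on md4", "md9 offline"])` counts two mentions of `disk4` and one of `disk9`, which makes it easier to see whether errors cluster on one slot or are spread across many.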
Drabert Posted January 24, 2013 Well, after the disk rebuilt, I restarted the system, and another disk is now showing a red ball. This is the first one that has come up bad with data on it. Should I just keep replacing the drives that come up red, or is there anything else I can do, since this drive was fine? I tried another restart of the system, but it is still coming up red. I will try powering the system down for a few minutes to see if that clears something up. These are all server-grade Hitachi drives that keep failing. So far this is the fourth drive that has come up with a red ball after a reboot. As always, thank you for your help. Drabert
BLKMGK Posted January 24, 2013 Is there anything common among all these drives? Controller card or cables? It seems very fishy that they're all failing like this. Any cooling failures recently?
Joe L. Posted January 24, 2013 You seem to have at least 11 disks on your server. What specific make/model of power supply are you using?
Drabert Posted January 24, 2013 I have 2 Dell 750W redundant power supplies in the system. Both are showing a green light as well.
Drabert Posted January 24, 2013 Will my data be safe if I just rebuild the array onto another new drive? Maybe these Dell systems do not like Hitachi drives for some reason?
lionelhutz Posted January 24, 2013 BLKMGK wrote: "Is there anything common among all these drives? Controller card or cables? It seems very fishy that they're all failing like this, any cooling failures recently?" I'd agree with that. Something is very wrong with the hardware for drives to just keep dropping out like you're seeing.
Drabert Posted January 24, 2013 Sorry, I missed that question from BLKMGK. The drives are plugged into an LSI SAS9211-4i card with one SAS cable to a Dell 2U chassis that has a backplane, so the drives are swappable. It looks like a Dell R720xd. I have not had any thermal issues since this summer on the original box.
Drabert Posted January 24, 2013 Right now I'm just worried about data and not hardware (as long as it eventually stops)... I can swap out this drive with another spare and just RMA this one. As of right now, I have not had a disk come back as "bad" a second time; it is a new one every time. So I will swap this one out too and see which one fails next. It does seem, though, like it is the Hitachi drives that are failing on me now, not the Dell POS ones.
Joe L. Posted January 25, 2013 Drabert wrote: "I have 2 Dell 750W redundant power supplies in the system. Both are showing a green light for them as well." A better question, then, is: what is the capacity in amps of the 12-volt rail powering your disks? With all the various disk errors on different disks, a common cause is a power supply unable to keep up with the current demands. Joe L.
Drabert Posted January 25, 2013 It's only drawing 250 watts out of 750.
JonathanM Posted January 25, 2013 That wasn't the question. How many amps (or watts) are available to the drives?
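The amps-versus-watts question comes down to P = V × I on the 12-volt rail, compared against what the drives can demand at once. The sketch below is only an illustration of that arithmetic: the 2.0 A per-drive spin-up figure is a placeholder assumption, not a measured or datasheet value for these Hitachi drives, and the function names are made up for this example. The drive count of 11 comes from the "at least 11 disks" remark earlier in the thread.

```python
def rail_watts(volts, amps):
    """Power on a rail: P = V * I."""
    return volts * amps


def spinup_amps(drive_count, amps_per_drive):
    """Worst-case 12 V current if every drive spins up at the same moment."""
    return drive_count * amps_per_drive


def has_headroom(rail_amps, drive_count, amps_per_drive):
    """True when the rail rating covers simultaneous spin-up of all drives."""
    return rail_amps >= spinup_amps(drive_count, amps_per_drive)


DRIVES = 11           # "at least 11 disks" from the thread
AMPS_PER_DRIVE = 2.0  # ASSUMPTION: placeholder spin-up draw; use the datasheet value
RAIL_VOLTS = 12.0

needed = spinup_amps(DRIVES, AMPS_PER_DRIVE)
print(f"{needed:.0f} A at {RAIL_VOLTS:.0f} V is {rail_watts(RAIL_VOLTS, needed):.0f} W")
# prints "22 A at 12 V is 264 W"
```

The number to compare against is the 12 V amp rating printed on the power supply label, passed as `rail_amps` to `has_headroom`. A total draw reading for the whole server does not by itself show how much current is available to the drives.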
Drabert Posted January 25, 2013 This is from the DRAC power monitoring. Sorry, but I'm not good with power stuff. Drabert
dgaschk Posted January 25, 2013 What is the exact model number of the PSUs?
Drabert Posted January 25, 2013 http://www.dell.com/downloads/global/products/pedge/en/Dell_Poweredge_R515_2P_4122_750W_Energy_Star_Data_Sheet.pdf Dell FN1VT 750 Watt Switching Power Supply for PowerEdge R510 R515