April 16, 2013
Hi all,
I'm running unRAID Plus 5.0-rc5 (I have been meaning to move to a more recent RC but hadn't got around to it). This evening I noticed that one of the disks in my array was showing lots of errors and generally looking unwell. Accessing the share for that specific disk showed lots of missing files, with others corrupted.
I took the array offline, powered down, checked cables etc, and powered back up. The disk is noisy when starting up, and while it does eventually spin up and keep spinning, the OS isn't seeing it at all. So I have a missing data disk. Ah well, time to buy a new disk, no big deal I thought.
However the array refuses to start, saying "Too many wrong or missing disks". The parity disk has a blue dot next to it, which I think means "new", but this is not the case - the parity disk is (as far as I know) absolutely fine. No changes have been made to the array at all since I added a new disk back in December.
How should I proceed? It's possible that the parity disk has also failed or got corrupted in some way, in which case I will have to give up on the failed data disk's contents and restore from a remote backup, but I'd rather not have to. Is there a way for me to tell unRAID that my parity disk is not new so that I can see what I can recover using the parity data?
Happy to provide any diagnostics/etc if that helps!
Thanks
Matt
April 16, 2013
As you know, the parity disk should show green if it's okay. Blue, however, is strange, as it indicates it's "new" -- not failed (which would be red). In any event, UnRAID doesn't recognize that you have valid parity (assuming you do) ... and there's no way to force it.
I'm afraid you've simply lost the failed data drive and will have to replace the drive and restore its contents from your backups.
I'd thoroughly test your parity drive as well, however, to be sure you don't have a 2nd failing/failed drive. Either run a pre-clear on it with JoeL's excellent pre-clear script, or remove it and test it in another PC with the manufacturer's disk diagnostics.
April 17, 2013 Author
Oh well that's a bit disappointing! It also begs 2 obvious questions:
- how do I proceed from here - my array is in a state where it won't start
- how do I figure out what went wrong - there's really not much value to me having a parity protected array if the first time I have a disk failure the parity drive also fails!
M
April 17, 2013
Quote: "In any event, UnRAID doesn't recognize that you have valid parity (assuming you do) ... and there's no way to force it."
I'm not sure that's entirely accurate. I don't know the procedure off the top of my head, but I'm sure I've seen something like it referenced here before. I'd email tom @ limetech before I did anything rash like preclearing the parity drive.
April 17, 2013
A bit of searching found several similar problems, so your parity drive may in fact not be bad -- there may simply be some corruption in the flash drive's configuration status.
I'd remove the flash drive, put it in another system, copy the latest syslog (in the logs folder) to that system, and post it here. One of the Linux gurus [JoeL, limetech, etc.] may be able to help resolve this so you can get your system starting again.
Do you by any chance have a copy of the flash drive from a few days ago when it was working? If so, you may be able to simply copy everything back to the flash drive and reboot.
April 17, 2013 Author
I will grab the syslog later when I'm back near the server. Not sure if I have a recent backup of the USB drive - suspect not, sadly.
Another thing occurs to me (which I guess is more of a feature request, perhaps): it's really annoying that I can't tell what is on each of my disks (without pulling them individually and very carefully mounting them read-only on a different machine, I guess).
Basically I have a choice. Either I blow away this failed disk and its contents, restart the array with one less disk, figure out what data I am missing by spotting the gaps, and then recover/restore the stuff I have offsite backups of; or I invest time and effort in obscure incantations designed to possibly get parity working again and maybe recover some proportion of the data on the disk.
It would be really useful if I could actually get a listing of what was on that failed disk, either by being able to see what's on the other disks (and spotting what's missing) or by having a record of what was on this disk. Does unRAID maintain some cache of the filesystem contents so that it can quickly build directory listings? Is there somewhere this data could be accessed so I can view what used to be on my dead disk?
Am I just asking for silly unrealistic things here?!
April 17, 2013
You could use the MD5Deep plugin from unMenu to generate MD5 checksums of all files periodically, then store those checksum files in multiple places - on the server and elsewhere. That would give you a list of every file, as well as a checksum of each file that you could use to determine whether any files are corrupt. It's not going to help in your current situation, but it might with any future problems.
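For anyone who wants to see the idea without the plugin, here is a minimal Python sketch of a checksum manifest. It is not MD5Deep itself, and the function names (md5_of_file, build_manifest, write_manifest) are made up for illustration. It walks one directory tree, hashes each file in chunks, and writes one "digest, path" line per file.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest


def write_manifest(manifest, out_path):
    """Write lines of '<digest>  <relative path>', sorted by path."""
    with open(out_path, "w", encoding="utf-8") as out:
        for rel_path in sorted(manifest):
            out.write(f"{manifest[rel_path]}  {rel_path}\n")
```

Running build_manifest once per disk and keeping the output somewhere off the array would give both a file listing and something to compare against later. Where the disks are mounted and where the manifest is stored are left to the reader.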
April 17, 2013 Author
Hmm. I've checked the SMART info for the parity drive, and it has seen a handful of UDMA errors (I think yesterday, though the timestamps are confusing). But it passed both short and long SMART tests, and seems to be working perfectly at the moment. It's on the same controller as the disk that failed, so maybe it's conceivable that the death of one caused some transient errors on the other? SMART report attached if anyone is interested.
All in all, though, it does seem likely that my parity data is fine, so I'd quite like it back!
Attachment: smart.txt
April 17, 2013
You can "print" directories of each of your disks individually if you want, using either Karenware's free Directory Printer [http://www.karenware.com/powertools/ptdirprn.asp] or Glenn Alcott's utility with the same name [http://www.galcott.com/dp.htm]. If you have CutePDF installed on your PC (highly recommended ... the free version is all you need: http://www.cutepdf.com/), these "prints" can be to PDF files.
Note, by the way, that in the more likely case that you have ONE drive fail (unlike your current issue), you can still print a directory of the failed disk. You can, in fact, actually remove a disk from an UnRAID system and STILL access its contents, print a directory, etc. ==> they will all be generated by UnRAID by reconstructing the contents by reading all of the other disks. You can even stream a movie from that disk. Of course you shouldn't do this (other than perhaps getting a directory) -- you should instead immediately replace the failed disk.
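The reconstruction works because single-disk parity of this kind is a bytewise XOR across the data disks. A toy Python sketch, using made-up three-byte "disks" and a hypothetical helper name (xor_blocks), not anything taken from UnRAID's own code:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three pretend data disks and the parity computed from them.
data_disks = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]
parity = xor_blocks(data_disks)

# Pretend the middle disk has been pulled: rebuild it from parity plus the survivors.
survivors = [data_disks[0], data_disks[2]]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data_disks[1]
```

The same arithmetic shows why one missing data disk plus an untrusted parity disk is a problem: there are then two unknowns and only one equation per byte.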
April 17, 2013
By the way, while you CAN keep current directories of all your individual UnRAID disks, keeping those directory lists current can be a fairly intensive effort. I use a different approach that doesn't depend on the directories of the disks to restore any missing data -- and forces me to keep my backups current at the same time (always a good idea anyway).
My backups consist of a set of 2TB disks (most recent ones 3TB) that I keep in DriveBox cases [http://www.amazon.com/WiebeTech-DriveBox-Anti-Static-Hard-Disk/dp/B004UALLPE/ref=pd_cp_pc_0]. I keep the current backup drive in an external caddy attached to my PC [http://www.newegg.com/Product/Product.aspx?Item=N82E16817153071]. Whenever I write files to the UnRAID array, I also write them to the current backup drive. When that drive gets full, I "print" (via Directory Printer and CutePDF) a PDF of its contents, then remove it and store it in a DriveBox, and put the next backup drive in the caddy. I keep my full backup drives in a fireproof/waterproof/data-certified safe.
The regimen of copying the file to the backup disk whenever writing to the array takes ZERO extra time -- just the discipline of doing it, since you can start the 2nd copy while the copy to UnRAID is in progress, and the backup drive write is much faster than the write to UnRAID, so it finishes first anyway. All it costs is the set of backup disks -- which you should have anyway !! A backup disk typically lasts me about 12-18 months, but obviously it depends on how much you're adding to your UnRAID array.
So if you have this complete set of backup disks, how do you restore ONLY what's missing? Simple -- just use the free SyncBack (or any of the many equivalents) when you need to restore data to the array. You just make a backup "profile" to back up the BACKUP drive to the UnRAID array. You then insert the set of backup drives in the caddy, one at a time, and run the SyncBack profile. It will only copy files that aren't on the array ... so once you've gone through the complete set of backup disks, the array will be current. Works perfectly, and fairly quickly (although obviously copying TBs of data still takes a fair number of hours).
Note that you don't have to do this if you have a single drive failure -- you just let UnRAID rebuild the failed drive.
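The "copy only what's missing" step can be sketched in a few lines of Python. This is not how SyncBack is implemented, only an illustration of the same idea; restore_missing and its parameters are hypothetical names, and the check is purely "does a file with this path already exist", with no content comparison.

```python
import os
import shutil


def restore_missing(backup_root, array_root, dry_run=True):
    """Copy files present under backup_root but absent under array_root.

    Returns the relative paths that were (or, in a dry run, would be) copied.
    Existing files are never overwritten or compared by content.
    """
    copied = []
    for folder, _subdirs, files in os.walk(backup_root):
        for name in files:
            source = os.path.join(folder, name)
            rel_path = os.path.relpath(source, backup_root)
            target = os.path.join(array_root, rel_path)
            if os.path.exists(target):
                continue
            if not dry_run:
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copy2(source, target)
            copied.append(rel_path)
    return sorted(copied)
```

Running it once per backup drive with dry_run=True first gives a list of what would be restored before anything is written.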
April 17, 2013 Author
I just run Crashplan and everything gets seamlessly and automatically backed up offsite (well, everything that matters anyway). However, since there was about 400GB of data on that disk, I would really rather not have to firstly figure out which 400GB I need to restore, and secondly actually restore it (which will take about 5 days of maxing out my internet connection).
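As a rough sanity check on restore times, the arithmetic is simple enough to script. This sketch assumes decimal units (1 GB = 8000 megabits) and a perfectly steady link, and the function names are made up for illustration:

```python
def transfer_days(size_gb, rate_mbit_per_s):
    """Days needed to move size_gb at a steady rate, using decimal units."""
    seconds = size_gb * 8000 / rate_mbit_per_s
    return seconds / 86400


def implied_rate_mbit_per_s(size_gb, days):
    """Steady rate (Mbit/s) implied by moving size_gb in the given number of days."""
    return size_gb * 8000 / (days * 86400)


# 400GB in about 5 days works out to roughly 7.4 Mbit/s sustained.
print(round(implied_rate_mbit_per_s(400, 5), 1))
```

Real-world throughput varies, so this only gives a ballpark figure.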
April 17, 2013
Since the latest 5.0RC12 release seems to be able to easily show, via the GUI, which disk(s) folders/files are located on (including showing when they are duplicated), I am wondering if there is now any command line way to get the same information.
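Nothing in this thread points to a built-in command, but the information can be gathered by walking each disk's mount point. A minimal Python sketch, assuming the data disks are mounted at paths such as /mnt/disk1 and /mnt/disk2 (an assumption about the setup), with hypothetical function names:

```python
import os
from collections import defaultdict


def locate_files(disk_roots):
    """Map each relative file path to the list of disk roots holding it."""
    locations = defaultdict(list)
    for root in disk_roots:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                rel_path = os.path.relpath(os.path.join(folder, name), root)
                locations[rel_path].append(root)
    return dict(locations)


def duplicates(locations):
    """Return only the paths that appear on more than one disk."""
    return {path: roots for path, roots in locations.items() if len(roots) > 1}
```

Calling locate_files(["/mnt/disk1", "/mnt/disk2"]) and saving the result somewhere off the array would also answer the earlier question of what used to be on a disk that has since died.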
April 17, 2013 Author
OK well a slightly odd chain of events means that I think I’m OK and have all my data back, though not without a bit of stress along the way.
I powered down the system, removed the dead drive, and while I left the unraid box running memtest I put the dead drive in a caddy and fired it up attached to my PC. It worked perfectly – all data seemed fine.
Since both the parity and the supposedly-dead drives were on the same SATA controller I guessed maybe it was the problem, so I reinstalled the not-actually-dead disk and attached it to a different controller. Unraid came up, found all the data disks and to my surprise it started the array, even with the parity disk missing. So now my array is up, and my parity drive is definitely not valid any more! Which I guess works out OK…
So I’m moving the data off the not-actually-dead disk and will then properly test both that and the parity drive before putting them back into service. And I now have to figure out what to do with the 2 new drives arriving from Amazon tomorrow :-) But I think we’re OK.
April 18, 2013
Good, glad all's okay. Sounds like you may have had a loose connection -- simply removing and replacing the SATA cable may have resolved this ... although it certainly doesn't hurt to switch controllers at the same time. Are all your SATA cables locking cables?
April 18, 2013
Question re Crashplan => which plan do you have that allows backing up network drives? With over 25TB of data I really don't consider a cloud-based backup viable, but I have toyed with it a few times. How much data do you have backed up?
April 18, 2013 Author
I just have the standard family plan. The Crashplan client is installed on the unraid box itself, which means that everything in my array looks like local storage (because it is!).
There's a Crashplan plugin for unraid which I installed very easily. The only bit that's annoying is configuring it - you have to edit some config files to point the Crashplan desktop app on your own PC at the Crashplan engine on the unraid box, but it's really not that hard. Follow the links/instructions from http://lime-technology.com/wiki/index.php/UnRAID_Plugins
I back up music, photos & videos, and other general stuff to Crashplan, but don't back up movies, TV shows, recorded TV etc which I could recover in other ways. So probably about 1TB of 10TB total. It's totally set-and-forget, just like unraid (well, mostly like unraid ;-)
And I also use it as a backup destination for my PCs around the house (as well as Crashplan central), so if one of them dies I have somewhere fast to restore from.
April 18, 2013
Thanks for the info. Even 1TB seems a bit much for a cloud-based setup; but once it's backed up, I suppose the changes over time aren't all that bad.
April 18, 2013 Author
Yes, the upload really isn't something I even think about - it just happens in the background, and even 1GB of new photos or music on the system is only 3 hours of uploading, so I'd have to be adding huge amounts of new stuff for it to fall behind. IIRC the initial sync took about 30 days, which sounds like a lot, but again if it's all in the background it doesn't really matter.
The only pain will come if I ever need to restore everything. And in that case a) I'll just go visit somewhere with a very fast internet connection and b) by definition something very bad has happened (like a 2 disk failure!) so it should be very rare...