February 8, 2014

I could use some help to avoid data loss. Unraid 5.05.

Not sure what is going on. I have been wrestling with a permissions issue on a share; see the other thread: http://lime-technology.com/forum/index.php?topic=31744.0

I was changing permissions and then decided to try to open some files, and noticed that items on disk8 were not opening: .jpg files say invalid file, videos won't play, etc. I went to the unRAID Main page and saw that parity is disabled and that disk8 and parity both have write errors. Here is a link to the main web GUI screenshot: https://www.dropbox.com/s/rimixz5fudrdvfi/Screenshot%202014-02-08%2004.19.19.png

Syslog can be found here: https://www.dropbox.com/s/l437d0nbzx5d36p/syslog.zip

I stopped the array while waiting for advice. Now Main shows issues with parity and disk8. Screenshot: https://www.dropbox.com/s/pqn53c5l08l8u68/Screenshot%202014-02-08%2004.37.46.png

Any suggestions as to how to proceed?
February 8, 2014 Author

The PSU is a Corsair 750 watt power supply. I am hesitant to power off the tower until I get some advice or a plan for addressing the drive issues. I know from past experience that having two disks disabled is a tricky situation.

Dan
February 8, 2014

What model PSU specifically? Is it single rail? When you click on the dropdowns that say "no device", is there a device for you to select? If not, I don't think there is anything you can do but try to get the drives recognized again by the hardware. There is nothing you can do from the GUI if it can't see the drives. Can you see them from the BIOS?
February 8, 2014 Author

Corsair 750TX, purchased in 2009. I believe this is single rail, but I am not sure. Is that good or bad?
February 8, 2014

The 750TX currently for sale at Newegg says it's single rail. That's good. Just checking. Often when more than one drive suddenly has problems, it is not the drives but some other part of the system. That's why I asked about the PSU and recommended checking the connections.
February 8, 2014 Author

Thanks truel. I rebooted the server (I am not starting the array until I get some help). Disk8 is now recognized and is green. I ran a short SMART test on disk8 and it passed. I kicked off a long test and am waiting for results.

I am not home right now, but as I recall parity came back with a blue ball and it states that it is invalid. I am thinking about mounting disk8 in unmenu to see if the contents are intact.
February 8, 2014

Disk8 was green in your original post as well. The errors shown in the error column were read errors -- not write errors. Had they been write errors, Disk8 would have been disabled. Normally those errors would have been automatically corrected ... but since you have a disabled disk they couldn't be.

If you value your data, shut down the server until you get a new disk for parity. Then install the new disk; boot the system; and either do a parity rebuild, or do a New Config and do an initial parity sync (probably the best choice). This will get you back into a fault-tolerant state and you'll no longer be running "at risk".

It's likely, however, that there are some areas on Disk8 that can't be read correctly (thus the errors). The best thing to do with that data is compare all the files on Disk8 against your backups, and replace those that are corrupted. Writing them back should result in automatic relocation of any defective sectors.

Alternatively, you could replace Disk8 with a new disk and then copy the data from the original one to the array from another PC. If you plan to do that, simply don't include Disk8 in the New Config you do with the new parity drive; then add a new drive after you've got the array protected; and finally copy the data from Disk8 ... watching for files that have read issues -- and replacing those from your backups.
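For readers who want to script the "compare all the files on Disk8 against your backups" step, here is a minimal sketch in Python. It is not an unRAID tool: the function names `file_digest` and `compare_trees` are made up for illustration, and it assumes both the disk and the backup are reachable as ordinary directory trees with the same relative layout.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source_root, backup_root):
    """Compare each file under source_root with the same relative path in backup_root.

    Returns a dict of lists of relative paths (POSIX style):
    'match', 'differ', 'missing_in_backup', and 'unreadable'.
    """
    source_root = Path(source_root)
    backup_root = Path(backup_root)
    result = {"match": [], "differ": [], "missing_in_backup": [], "unreadable": []}
    for src in sorted(p for p in source_root.rglob("*") if p.is_file()):
        rel = src.relative_to(source_root)
        bak = backup_root / rel
        if not bak.is_file():
            result["missing_in_backup"].append(rel.as_posix())
            continue
        try:
            same = file_digest(src) == file_digest(bak)
        except OSError:
            # A read error on either side; the file needs a closer look.
            result["unreadable"].append(rel.as_posix())
            continue
        result["match" if same else "differ"].append(rel.as_posix())
    return result
```

Calling `compare_trees("/mnt/disk8", "/mnt/backup")` (both paths are placeholders) would give the lists to review. A file in `differ` only says the two copies are not identical; it does not say which copy is the good one.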
February 8, 2014 Author

Here is the long SMART report. It says the test passed; however, there are a bunch of errors at the bottom of the report. See attached.

Would it be safe to unassign disk8 and then use unmenu to mount it read-only? I was wondering if this would give me an idea as to the state of the data.

disk8_smart_rpt.txt
February 8, 2014

The SMART report looks fine ... it's likely there's nothing wrong with the disk. It did abort a few read attempts (thus the errors you saw earlier).

The FIRST thing you should do is get your array protected again. If you want to isolate Disk8 from this, then just do a New Config without Disk8; then install a new parity drive; and Start the array and let it do a new parity sync. When that's done, do a parity check to ensure all went well ... and you then have a protected array again. Then add another drive so you'll have enough space for the data from Disk8.

And THEN you can worry about the data from Disk8. As I noted earlier, the simplest approach at that point is to attach Disk8 to a different computer, and simply copy the data back to the array. On a Windows system, you just need to install the free LinuxReader to read the disk [ http://www.diskinternals.com/linux-reader/ ]. Or you could attach it "outside the array" on the UnRAID system and copy the data from there. I'd prefer the Windows approach, as it's easier to monitor the copies and see which files (if any) have errors when you attempt to read them to copy them to the UnRAID array.

But I'd definitely get the array protected BEFORE doing anything else. Otherwise any other failure will for sure result in even more data loss.
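If the disk ends up mounted read-only somewhere, the "see which files have errors when you attempt to read them" check can also be done before copying anything. The following is only a sketch under that assumption: `find_unreadable_files` is a hypothetical helper, not part of unRAID or unmenu, and it simply reads every file end to end and records the ones the operating system refuses to read.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every file under root in full and return (path, error) pairs for failures."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # Read and discard the data; only the success of the read matters.
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                bad.append((path, str(exc)))
    return bad
```

An empty list means every file could be read all the way through. It does not prove the contents are correct, so comparing against backups is still worthwhile for anything important.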
February 9, 2014 Author

Gary,

What I am wrestling with is the path to take. This is not the first time both parity and disk8 have failed. The weird thing is that they have both been replaced since the last round of failures 6 months or so ago. When the disks failed I ran extensive testing and preclearing and never got any errors.

I did go ahead and unassign the drive, mounted it read-only in unmenu, and shared the drive. Using Windows and the read-only share I was able to back up some critical photos and videos. Last night when I discovered this, the jpg files all said invalid file and the videos would not play (I can't remember the error from VLC). The backups of these files that I just copied from disk8 to my Windows machine all open perfectly and the videos play back fine. I got no errors in TeraCopy while moving the files.

My motherboard is the old unofficial CSEEE and I use a combination of onboard SATA ports and the Supermicro SAS AOC mv8 add-in board.

Dan
February 9, 2014

Since you're having issues with the same two logical disks, but different physical units, I'd move them to different SATA ports (if possible) AND use new SATA cables, just to confirm this isn't a port or cable issue.

Another possibility is that your PSU has deteriorated some and is having occasional "glitches" on the power bus. If you have a spare unit, you could check that as well.

If you think the parity drive is actually okay, you could do a New Config and let parity build back to the same drive -- but recognize that this may fail if there's truly an issue with the drive.

These kinds of issues are very difficult to pinpoint without some experimentation ... but do ONE thing at a time and you'll find the issue :-)
February 10, 2014 Author

Thanks Gary,

I will have to experiment. I don't have a spare PSU. I found this thread and it got me thinking. I do not have the same card as in that thread (I have the AOC SASLP mv8), but it made me wonder if updated firmware might help me as well. http://lime-technology.com/forum/index.php?topic=26719.msg234110#msg234110

For my version of the card, firmware 3.1.0.21 is available (I was previously on 3.1.0.15N), so I updated the firmware on the SAS card. That will be my one thing to test for now.

I brought the array back up, and parity has been rebuilt with no errors and no data was lost. I am running a parity check now.

Are there any utilities that will monitor and log voltage? If so, that could be a handy tool to diagnose a flaky PSU. I am off to search Google to see if I can find anything.

Dan
February 10, 2014

You can get multimeters that will log voltage over time, but they cost more than a good new power supply (several hundred $$).

I'm not aware of any monitoring software that logs/graphs any voltages other than the CPU and memory values ... i.e. they don't monitor the 5v or 12v buses. There may be some -- I'm just not aware of them.
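On the voltage-logging question, one avenue that may or may not apply to this board: when the Linux kernel has a hwmon driver loaded for the motherboard's sensor chip, it exposes voltage inputs as `in*_input` files (in millivolts) under `/sys/class/hwmon`. Whether the 5v and 12v rails appear there, and whether the raw numbers need scaling, depends on the chip and the board, and nothing in this thread confirms that unRAID 5 loads such a driver on this hardware. With those assumptions stated, a rough Python sketch of a logger (the names `read_voltages` and `append_sample` are made up) could look like this:

```python
import glob
import os
import time


def _read_text(path):
    """Return the stripped contents of a small sysfs file, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def read_voltages(hwmon_root="/sys/class/hwmon"):
    """Return {'chip/label': volts} for every in*_input file found under hwmon_root."""
    readings = {}
    for chip_dir in sorted(glob.glob(os.path.join(hwmon_root, "hwmon*"))):
        chip = _read_text(os.path.join(chip_dir, "name")) or os.path.basename(chip_dir)
        for input_path in sorted(glob.glob(os.path.join(chip_dir, "in*_input"))):
            raw = _read_text(input_path)
            try:
                millivolts = int(raw)
            except (TypeError, ValueError):
                continue  # missing or non-numeric reading
            channel = os.path.basename(input_path)[: -len("_input")]
            label = _read_text(os.path.join(chip_dir, channel + "_label")) or channel
            # Raw sensor value; some boards need per-channel scaling on top of this.
            readings[f"{chip}/{label}"] = millivolts / 1000.0
    return readings


def append_sample(log_path, readings, timestamp=None):
    """Append one line to log_path: ISO-style timestamp, then name=volts pairs."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(timestamp))
    fields = [f"{name}={volts:.3f}" for name, volts in sorted(readings.items())]
    with open(log_path, "a") as f:
        f.write(",".join([stamp] + fields) + "\n")
```

Calling `append_sample("voltages.log", read_voltages())` from a loop or a cron job would build up a log over time. Software readings like these are coarse and sampled slowly, so they would only show sustained sags, not the brief "glitches" a failing PSU can produce.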
February 10, 2014 Author

Thanks Gary. I see your point on a new PSU being cheaper. At this point I am going to mark this thread as solved ... hopefully my gremlin will not return!

Dan