January 5, 2014

Over the last couple of days, while I was redoing some movies, I noticed that copying files to the server seemed to be failing. I had been running 5.0-rc** since April of this year with no issues. All the drives were new and pre-cleared when the server was built in fall 2012.

I looked at the unRAID main page and saw several errors on the parity drive and a few on disk 1. Both drives are on the mainboard controller, SATA ports 0 and 1. I shut down, replaced the SATA cable for the parity drive, and upgraded to 5.0.4 at the same time. I then turned the server back on and started a parity check. The syslog and the two SMART files for those two drives are what I have as of this morning.

I am hoping someone can shed some light on how to proceed with a fix. There are about 2 hours left on the parity check, with 0 errors so far, BUT with so many errors in the syslog I am concerned about whether I can actually trust the parity check.

I have two 3TB drives in the server that have no data on them. Is it possible to remove these two drives and replace them with 2TB spares that I have, and then replace the parity drive and disk 1 with the 3TB drives I am removing?

Thanks in advance for your assistance.

Here is a link to the syslog; it was too large to upload as an attachment: http://aimoutboards.com/downloads/syslog.txt

EDIT - 10:02am: disk5 is now reporting major errors as well. The syslog is full of read errors. The SMART report for disk5 is attached. All three drives that seem to be failing are WD 3.0TB.

EDIT 2 - 12:23pm: the parity check completed with no errors.
parity: writes = 11,820, errors = 11,725
disk1: writes = 13,693, errors = 13,683
disk5: writes = 9,148, errors = 9,138
It has not failed any drives yet and they are all still green.

smartparityjan5922.txt smartdisk1jan5922.txt smartdisk5jan5.txt
January 5, 2014 Author

It is on a PC Power & Cooling (before OCZ bought them) Silenx 750, and that is on a Smart UPS RM 2200.

Thanks, Dave
January 5, 2014 Author

It is a Silencer 750 EPS12V, one of the more highly rated PSUs when they were new. Here is a review and the specs: http://tech-reviews.co.uk/reviews/pc-power-and-cooling-silencer-750w-psu/

Features
• 750W Continuous @ 40C (825W Peak)
• Up to 90% (10dB) Less Noise per Watt
• NVIDIA SLI Certified (Dual 8800 GTX)
• 80+ Certified (83%); .99 Active PFC
• +12VDC @ 60A (Powerful Single Rail)
• Rock-Solid, Super-Clean DC Output
• 24-pin, 8-pin*, 4-pin M/B Connectors
• Quad PCI-E and 15 Drive Connectors
• Automatic Fan Speed Control Circuit
• 5-Year Warranty and Tech Support

Specifications

AC Input
  Operating Range: 90-264 VAC, .99 power factor
  Frequency: 47-63Hz
  Current: 12A
  Efficiency: 83%
  EMI: FCC-B, CE

DC Output
  Output: +5V @ 30A, +12V @ 60A, -12V @ 0.8A, +3.3V @ 24A, +5VSB @ 3A
  continuous = 750W, peak = 825W
  Regulation: 3% (+3.3V, +5V, +12V), 5% (-12V)
  Ripple: 1% (p-p)
  Hold Time: >16ms
  PG Delay: 300ms

Safety
  Over Voltage Protection: +3.3V, +5V, +12V
  Over Current Protection: 135% OPP
  Agency Approval: UL/cUL/CE/CB/RoHS

Environmental
  Temperature: 0° – 40°C
  Humidity: 20% – 80% RH

Fan
  Type: 22 – 55 CFM ball-bearing
  Noise: 26 – 40dB(A)

Miscellaneous
  M/B Connectors: 24-pin, 8-pin CPU, 4-pin CPU
  Video Connectors: two 6-pin PCI-E, and two 6/8-pin PCI-E
  Drive Connectors: 8 SATA, 8 Peripheral, 1 Floppy
  MTBF: 100,000 hours
  Power Cord: 6' 14AWG (incl.)
  Warranty: 5 Years
January 6, 2014 Author

Hi, does anyone have any idea how to proceed? I am concerned about replacing any of the drives at this point because they all show pending sectors: 3 on disk 5, 65535 on parity and 65534 on disk1. I rebooted the server after yesterday's parity check and I have not seen any errors in the log since.

Thanks, Dave
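For anyone comparing SMART reports the same way, here is a minimal Python sketch of pulling the pending-sector count out of a `smartctl -A` style attribute table. The sample text is made up for illustration and is not taken from the reports attached above. The `looks_saturated` helper is a hypothetical heuristic: 65535 is 0xFFFF, the largest 16-bit value, so a count at or next to it may be a pegged counter rather than a literal sector count. That reading is an assumption, not something the thread confirms.

```python
# Illustrative only: SAMPLE imitates the attribute table printed by
# `smartctl -A`; it is not taken from the reports attached to this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       65535
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""


def pending_sectors(report):
    """Return the raw Current_Pending_Sector value, or None if it is absent."""
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None


def looks_saturated(raw):
    """Hypothetical heuristic: flag counts at or next to 0xFFFF (65535)."""
    return raw >= 0xFFFE


count = pending_sectors(SAMPLE)
print(count, looks_saturated(count))
```

A count of 3, like the one reported for disk 5, would not be flagged by this heuristic, while 65534 and 65535 would.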
January 6, 2014

Several drives have unreadable sectors and it is unlikely unRAID can recover the drives. unRAID can recover a single failed drive by reading all of the other drives. In this case, the other drives are not readable. The drives are probably fine and will be usable after being pre-cleared. Copy as much data from the problem drives as possible. A recovery CD, e.g., SystemRescueCd, may be able to recover more. There is a hardware issue with the server.
January 6, 2014 Author

I guess my biggest question is how. There have been no signs of trouble in any of the parity checks over the last 6 months. The two oldest items in the server are the PSU and the three Supermicro 5-in-3 docks. The SAS cards are both new as of last year, as are all the cables. The drives showing problems are all on the motherboard SATA controller, and since replacing the SATA cable for the parity drive I have not seen any more errors, although I am a little hesitant to run another parity check.

I am a little lost now. I guess the easiest thing is to replace the PSU and run a memtest. I know you are not supposed to with a hardware issue, but to satisfy my concerns I ran a filesystem check on both disk 1 and disk 5, and both came back fine with no corruption.
January 7, 2014 Author

I guess I am wondering why you are saying there is a hardware issue when I have had no problems at all in over a year. This all started out of the blue after copying a few new movies to the server.

Thanks, Dave
January 7, 2014

Just because the server ran for years without issue doesn't mean that it hasn't RECENTLY developed a hardware issue. Hardware does fail occasionally, and when it does it can manifest itself in strange ways. My guess after reading through the thread would be an issue of some sort with the on-board SATA controller. You said that since you replaced the SATA cable on the parity drive you have seen no further errors. If it stays this way, great. If not, I'd suspect the on-board SATA controller.
January 8, 2014

Last week, whenever I ran a parity check, 3 of my drives would all just show as missing. I would power off and on and they would be back. I ran another parity check (in the unRAID GUI each drive jumped to something like 50,000 errors) and then they went missing again. I checked the wires and upgraded the SAS card, with the same result. Eventually it happened again while running parity, and this time one of my drives would not come back up after power on and another one got disabled. I replaced the dead drive with another drive and ran a special command to re-enable the disabled disk. Parity reconstructed onto the new drive and all has been well. Meanwhile, that dead drive had passed SMART with flying colors a few days earlier. Once the drive was recovered, I ran disk checks on each drive (everything good) and a parity check (everything good).
January 8, 2014

There have been a few cases in the past with power supplies causing pending sectors to appear on WD drives. I recall it being Antec, but still that could be what's happening. After replacing the power supply all the pending sectors just went away. Your log is indicating media errors, not SATA link errors.
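To make the media-versus-link distinction concrete, here is a rough Python sketch that sorts syslog lines into the two groups. The marker strings are assumptions based on typical Linux libata wording (`UNC` for an uncorrectable media error, `ICRC` and link resets for interface trouble); they are not taken from the syslog linked in this thread, and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# Assumed markers, based on typical Linux libata messages; adjust them to
# the wording your kernel actually logs.
MEDIA = re.compile(r"\bUNC\b|[Mm]edia error")
LINK = re.compile(r"\bICRC\b|hard resetting link|SError:")


def classify(line):
    """Label one syslog line as 'media', 'link', or 'other'."""
    if MEDIA.search(line):
        return "media"
    if LINK.search(line):
        return "link"
    return "other"


def summarize(lines):
    """Count how many lines fall into each category."""
    return Counter(classify(line) for line in lines)


# Hypothetical sample lines, not copied from the poster's syslog.
SAMPLE = [
    "kernel: ata1.00: error: { UNC }",
    "kernel: ata1.00: error: { ICRC ABRT }",
    "kernel: ata1: hard resetting link",
    "kernel: usb 1-1: new high-speed USB device",
]

print(summarize(SAMPLE))
```

A log dominated by the "media" group points at the drive or whatever is feeding it (such as power), while a log dominated by "link" points more toward cables, backplanes, or the controller.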
January 10, 2014 Author

Hey All,

Thanks for the help and ideas so far. I grabbed some 3TB RED drives and set up a temp server, and those drives are on their 2nd pre-clear now. I went back and looked at the pre_clear reports for the three drives in question in the array, and all 3 passed at the time with no issues. I also checked some syslog files from the last year: the read errors did start around mid-2013 and appear to have grown to where they are now, although disk1 shows no errors other than 65535 pending sectors.

So, on to the next phase. The existing parity drive, disk 1 and disk 5 seem to be in question. Since installing the new Seasonic X-850 PSU I have had no read errors in the syslog and no errors on the main page.

Next STEP - I have user shares set up, so I am wondering what the best practice is for moving the data off disk 1 and disk 5. I have lots of empty space on 4 other hard drives in the array, and they are all included disks on each of the user shares TVShows, HDMovies and SDMovies. I know I can use copy or mv to do so, BUT I have user share folders on disk 1 (/mnt/disk1/TVshows, SDMovies and HDMovies) and on disk 5 (/mnt/disk5/HDMovies, SDMovies).

Thanks, Dave
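One way to approach the disk-to-disk move is sketched below in Python. It is only a sketch under stated assumptions: the function names are hypothetical, the destination disk (`/mnt/disk2` in the usage comment) is a placeholder, and it copies and verifies without deleting anything, so removing the originals stays a manual step. It keeps both sides on per-disk paths (`/mnt/diskN/...`) instead of mixing them with user share paths, which is a commonly repeated caution for this kind of move.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Hash a file in 1 MiB chunks so large movies do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_moves(src_disk, dst_disk, share):
    """Pair each file under src_disk/share with the same relative path on dst_disk."""
    src_root = Path(src_disk) / share
    dst_root = Path(dst_disk) / share
    return [
        (path, dst_root / path.relative_to(src_root))
        for path in sorted(src_root.rglob("*"))
        if path.is_file()
    ]


def copy_verified(src, dst):
    """Copy one file, then confirm the copy by comparing checksums."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return sha256(src) == sha256(dst)


# Example use (paths are placeholders; nothing is deleted):
# for src, dst in plan_moves("/mnt/disk1", "/mnt/disk2", "HDMovies"):
#     print(src, "->", dst, "ok" if copy_verified(src, dst) else "MISMATCH")
```

Because the share folder name is preserved on the destination disk, the copied files should keep appearing under the same user share; the source copies can be removed once every file reports a matching checksum.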
January 12, 2014 Author

Well, I am shocked. I shut down the server, updated the motherboard's BIOS, rebuilt my Kingston flash drive and did a memcheck, rebuilt corrected parity, and then replaced disk 5 with a new RED 3.0TB. It rebuilt disk 5 overnight with no issues. Disk 5 had the worst of the SMART reports, so I did it first, and I had a backup of the flash put aside just in case. I am pre-clearing the old disk 5 now in a test machine.

I had ZERO read errors from any disks or in the log while the data was being rebuilt on disk 5. I ran a file check on disk 5 before and after the rebuild and no corruption was found. Now I have to replace disk 1 and the parity drive. I guess the old power supply was the root of my problem.

One thing I am completely confused about: I bought all these 3.0TB WD Greens at about the same time (in fact, I think 4 of them came from a Boxing Day sale at Christmas 2011) and they are all WD30EZRX drives. But when I run the serial numbers on the WD warranty page, only 1 of the 4 drives comes back with a warranty available and as a 3.0TB Green. The other drives' serial numbers show as 3.0TB WDessential book drives, and the page says they have no warranty.

rgds, Dave