June 21, 2011 Running 5.0beta6a. The check box won't correct the errors. I've run SMART diagnostics on each HDD, and someone mentioned they seemed alright. I ran memtest all day and night with no problems. Need to get this corrected. Thanks! Attachments: syslog3.zip, smartsdx.zip
June 21, 2011 Me too, except I have over 14,000 parity errors. Need to figure out wtf is wrong, though I've never run it with auto-correct. The whole "parity is valid" message throws me off. They should fix that so you know your parity ISN'T valid.
June 21, 2011 I don't know what to do. Just rebuild parity? I'm not sure how it got messed up, but I feel like it was borked from the beginning...
June 21, 2011 The syslog shows issues with disk2 and disk4. The SMART reports are inconclusive. There may be bad or loose cables, bad SATA ports, a marginal power supply, bad RAM, or a bad motherboard. Run a memtest overnight.
June 21, 2011 (Author) Ran memtest all night and day: 21 passes, 0 errors. I'll concentrate on disk 2 and disk 4 and check the cables.
June 21, 2011 (Author) Disk 2 and disk 4 are aligned and the others are unaligned... I don't think that has anything to do with it, do you?
June 22, 2011 (Author) When I transfer files internally, my server stops working. I can still transfer files from my desktop; I just can't move them around within the server... Can anyone get anything from this? This happens during an internal transfer.
June 22, 2011 You are not alone... In 5.0beta6a I have none; in 5.0beta7 I have 186. "Array Status: STARTED, 8 disks in array. Parity is Valid. Last parity check < 1 day ago. Parity updated 183 times to address sync errors." I ran a parity check with the fix option and it still shows the same (unless I had fewer before I started?). I want to pull the drive I think is "bad" and rebuild it. Is this a bad idea while it shows parity errors?
June 22, 2011 (Author) My sys specs, or johnm's? If I restart my server, will it clear the syslog? This message, or one very similar, is constantly on my screen when I reboot the server: "udevadm settle- timeout of 180 seconds reached, the event queue contains: /sys/devices/pci0000:00/0000:00:11/host3/target3:0:0/3:00:0/block/sdd/sdd1 (944)", although the numbers in the parentheses may be different. Newest log attached. My sys specs are in the signature on each of my posts. Attachment: syslog4.zip
June 22, 2011 Does it matter which drives are involved in the copy? Does it fail no matter which drives are involved? Which PSU? Disk1, disk3 and disk4 are all showing errors in the syslog.
June 22, 2011 (Author) I will test each individual disk. My PSU is a CORSAIR Builder Series CX430 CMPSU-430CX 430W.
June 22, 2011 Hey, reading this thread as I'm having similar issues with parity on my server. Can you tell me if you have a cache drive in your configuration? I ran a parity check before with no issues, then added a cache drive and ran the mover script after a few files were copied to the drive, and it came back with parity issues. So I'm curious whether your configuration includes a cache drive, and what size it is.
June 22, 2011 Mine does, but I don't recall when it was added. I've only had the server for about 2 months, so it was within the first week that the cache drive was added. It's a 500GB drive, and all of my User Shares utilize it.
June 22, 2011 Thanks for that update. I think the key is the cache drive (a last resort, as I also ran memory checks and switched out drives, and I'm still consistently getting fewer than 10 parity errors)... I'm going to disable it and see if I get the same issues.
June 23, 2011 (Author) OK, I'm still doing some more testing, but I realized something as I was taking screenshots. I was moving a folder from disk to disk and got to disks 1, 2 and 3 (haven't tried 4), and I realized that I got errors on one specific file. It's an .iso of a particular episode... Any idea why this would be? This is what is causing my server to just crash... not normal, for sure...
June 23, 2011 I'm guessing the sectors on that disk that hold the particular file are bad. Post a SMART report for that drive. Let's see if anything has changed. I would replace and rebuild the drive holding that file and then run several preclears on the suspect disk.
June 23, 2011 I looked at "crontab -l" and executed the command listed there for the monthly parity check. However, I received an error about an invalid argument. I kept running it a few times, and eventually went to the webGUI and saw that it was actually running. I then began receiving emails about a resync in process. Is this what I want? I ran it with NOCORRECT, but resync sounds like it's re-calculating parity from the data disks. Welp, so far (last I checked, a few hours ago before I left for work) it was at 90%+ and had 0 errors, so hopefully everything will be corrected. Should I run another parity check through the webGUI page after this one completes to verify there are no errors?
June 23, 2011 SMART doesn't show any problems. Replace and rebuild disk 2. Then RMA it or run multiple preclears.
June 23, 2011 Quote: I'm guessing the sectors on that disk that hold the particular file are bad. Post a SMART report for that drive. Let's see if anything has changed. I would replace and rebuild the drive holding that file and then run several preclears on the suspect disk.
I'm guessing the file-system on that disk is in need of repair. (I've seen corrupt file-systems crash a server.) Re-constructing it onto another disk will only give the replacement disk the same corruption (if it is corrupted). I suggest you check it for file-system errors as described in the wiki: http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems
June 23, 2011 Quote: I looked at "crontab -l" and executed the command listed there for the monthly parity check. However, I received an error about an invalid argument. I kept running it a few times, and eventually went to the webGUI and saw that it was actually running. I then began receiving emails about a resync in process. Is this what I want? I ran it with NOCORRECT, but resync sounds like it's re-calculating parity from the data disks. Welp, so far (last I checked, a few hours ago before I left for work) it was at 90%+ and had 0 errors, so hopefully everything will be corrected. Should I run another parity check through the webGUI page after this one completes to verify there are no errors?
The messages emitted when using NOCORRECT are absolutely identical to those emitted when actually correcting parity. The NOCORRECT feature was poorly/half implemented. It performs no "writes" to correct the errors it finds, but everything on the interface still says and looks as if it does fix them. Very confusing for most.
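To make the correct vs. NOCORRECT difference concrete, here is a minimal Python sketch. It is not unRAID's actual code. It assumes simple single parity (each parity block is the XOR of the same block on every data disk), and the names `expected_parity`, `parity_check` and the toy disk values are made up for illustration. The point it shows: both modes report the same number of sync errors, but only the correcting mode writes to parity, so a non-correcting check reports the same errors again on the next run.

```python
from functools import reduce
from operator import xor


def expected_parity(data_disks, block):
    """XOR of the given block across all data disks (assumed single parity)."""
    return reduce(xor, (disk[block] for disk in data_disks), 0)


def parity_check(data_disks, parity, correct):
    """Count blocks whose stored parity disagrees with the data disks.

    The count is the same in both modes; the parity list is modified
    only when `correct` is True.
    """
    sync_errors = 0
    for block in range(len(parity)):
        want = expected_parity(data_disks, block)
        if parity[block] != want:
            sync_errors += 1
            if correct:
                parity[block] = want  # the only write in this sketch
    return sync_errors


# Toy array (hypothetical values): three data disks, four one-byte blocks each.
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
parity = [expected_parity(data, b) for b in range(4)]
parity[2] ^= 0xFF  # corrupt two parity blocks on purpose
parity[3] ^= 0x01

first = parity_check(data, parity, correct=False)   # reports, writes nothing
second = parity_check(data, parity, correct=False)  # same errors still there
third = parity_check(data, parity, correct=True)    # reports and repairs
fourth = parity_check(data, parity, correct=False)  # clean after the repair
print(first, second, third, fourth)  # prints: 2 2 2 0
```

In this sketch, the only way to confirm that a correcting pass actually fixed parity is to run one more check afterwards and see zero errors.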
This topic is now archived and is closed to further replies.