Joseph Posted April 2, 2017 (edited)
CORRECTED TITLE: 5 Parity Errors After Every Reboot
Hey everybody,
I've been pulling my hair out on this one, guys. On my unRAID box, I keep getting 5 parity errors, which are not being corrected, on every monthly and even manual parity check. There are also numerous errors in the syslog that I can't make heads or tails of. I rebooted late last night and ran a parity check overnight. Can someone take a look at the posted log and advise? Any help is much appreciated!!
"Write corrections to parity" is checked:
Last check completed on Sun 02 Apr 2017 10:37:14 AM CDT (today), finding 5 errors. Duration: 10 hours, 24 minutes, 54 seconds. Average speed: 106.7 MB/sec
TL;DR: Don't use Marvell HDD controllers. Issue resolved by removing the SATA cables from the onboard Marvell controller and replacing the SAS2LP-MV8 HBA with a Dell H310 flashed to LSISAS2008 (P20).
Edited April 23, 2017 by Joseph (Issue Resolved)
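As a quick sanity check on the figures in the opening post, the reported duration (10 hours, 24 minutes, 54 seconds) and average speed (106.7 MB/sec) can be multiplied to estimate how much data the check covered on each disk. This is only a sketch: it assumes that "MB" means 10^6 bytes and that the average speed applies to the whole run, and the variable names are made up for illustration. The thread does not state the actual drive sizes.

```python
# Figures copied from the opening post.
hours, minutes, seconds = 10, 24, 54
avg_speed_mb_s = 106.7  # assumption: 1 MB = 10**6 bytes

# Total run time in seconds.
duration_s = hours * 3600 + minutes * 60 + seconds

# Data covered per disk, assuming the average speed held for the whole run.
data_read_mb = duration_s * avg_speed_mb_s
data_read_tb = data_read_mb / 1_000_000

print(f"{duration_s} s at {avg_speed_mb_s} MB/s is about {data_read_tb:.2f} TB per disk")
```

If the result is close to the size of the largest disk in the array, the check most likely ran from start to finish rather than aborting early.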
JorgeB Posted April 2, 2017
Let me guess, you have a SAS2LP?
Joseph (Author) Posted April 2, 2017
I have the AOC-SAS2LP-MV8. Is that an issue?
JorgeB Posted April 2, 2017
It is for a small number of users. Sometimes it helps to disable VT-d (if it's enabled and you don't need it). Other than that, check for BIOS updates and try a different PCIe slot if one is available. If the issues persist, consider getting an LSI-based controller.
Joseph (Author) Posted April 3, 2017 (edited)
8 hours ago, johnnie.black said:
It is for a small number of users. Sometimes it helps to disable VT-d (if it's enabled and you don't need it). Other than that, check for BIOS updates and try a different PCIe slot if one is available. If the issues persist, consider getting an LSI-based controller.
<sarcasm> grrreeeeaaaaat </sarcasm> I bought this controller brand new right at a year ago too. Grrrr! Anyways, thanks for your response, Johnny.
I can't remember when this issue started, but I suspect it began once my drive array grew from 3 or so drives to the 13 drives (including 2 parity and 2 cache) I have now. I run VMs, so I need VT-d enabled. The mobo and the SAS2LP have the "latest" firmware. FWIW, I believe ASUS has stopped supporting the mobo. Moving the controller to another slot would be rather challenging because of the limitations of the slots and the existing cards installed.
Hopefully you can, or will be willing to, point me in the right direction on a few more related questions:
What controller board would you recommend for SATA 3 drives? (I see that Supermicro sells the LSI00188 9200-8E.)
Is there any real harm in having just 5 parity errors? (In other words, how many files are affected by each parity error?)
How was it determined that the controller is at fault? Is there a whitepaper, a discussion, or perhaps some sort of acknowledgement from Supermicro?
If this is a known issue, is there a hardware recall?
Other than what you suggested, is there any other recourse?
Looking forward to hearing your thoughts. Thanks again!!
Edited April 3, 2017 by Joseph (fixed typo)
ashman70 Posted April 3, 2017
I had a SAS2LP controller in my SuperMicro server. I started getting regular errors in my parity check too, then things went crazy. In my case it may have been a compatibility issue between the controller and my backplane, but they are made by the same company.... In any case, I got rid of it and replaced it with an IT-flashed Dell Perc H310, and my issues went away.
JorgeB Posted April 3, 2017
3 hours ago, Joseph said:
What controller board would you recommend for SATA 3 drives? (I see that supermicro sells the LSI00188 9200-8E)
Any SAS2008-based controller. The 9200-8e is one, but its ports are external. The equivalent internal model is the 9210-8i or 9211-8i. Most people buy the cheaper IBM M1015/Dell H310/H200 on eBay and then crossflash them to LSI IT mode.
3 hours ago, Joseph said:
Is there any real harm in having just 5 parity errors?
One parity error is one too many. A single error is enough to cause one file to be corrupt after a rebuild (though, depending on the file, the corruption may or may not be noticeable; e.g., a single error or very few errors probably wouldn't be noticeable in a video file).
4 hours ago, Joseph said:
How was it determined that controller is at fault? is there a whitepaper, discussion or perhaps some sort of an acknowledgement from supermicro?
I can't say whether the problem is the controller, the driver, or the combination of the controller and the hardware used, but I'm 99.99% sure that changing to an LSI will fix your issues.
4 hours ago, Joseph said:
If this is an known issue, is there a hardware recall?
It's a known issue with unRAID v6, but like I said, it only affects a small percentage of users. There's no recall.
4 hours ago, Joseph said:
Other than what you suggested, is there any other recourse?
I already mentioned the workarounds I know of; disabling VT-d is usually the one that works best.
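To illustrate why even one parity error matters, here is a minimal sketch of XOR-based single parity. It is a simplification, not unRAID's actual implementation: the three 4-byte "disks" and the helper names `xor_parity` and `rebuild` are hypothetical, and a second parity disk (which uses a different calculation) is ignored. It shows that a single wrong bit in parity comes back as a single wrong bit in the rebuilt disk.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Reconstruct the one missing block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Three hypothetical data disks, 4 bytes each.
disks = [b"\x10\x20\x30\x40", b"\x01\x02\x03\x04", b"\xaa\xbb\xcc\xdd"]
good_parity = xor_parity(disks)

# Simulate one uncorrected parity error: flip a single bit in the first byte.
bad_parity = bytes([good_parity[0] ^ 0x01]) + good_parity[1:]

# Pretend the third disk failed and rebuild it from the other two.
rebuilt_ok = rebuild(disks[:2], good_parity)
rebuilt_bad = rebuild(disks[:2], bad_parity)

print("rebuild with good parity matches:", rebuilt_ok == disks[2])
print("rebuild with bad parity matches: ", rebuilt_bad == disks[2])
```

With correct parity the rebuilt disk is identical to the lost one. With the flipped bit, the rebuilt disk differs in exactly that position, so whichever file occupies that location is silently altered.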
EdgarWallace Posted April 3, 2017
I have installed a SAS2LP in my backup server, disabled VT-d, and am still getting parity check errors. I ordered a Dell Perc H200 today and will report back if that card resolves my issues. By the way, is it safe to run the parity check with the "Write corrections to parity" option once the new controller is installed?
Joseph (Author) Posted April 3, 2017
Thanks, Johnny, for shedding light on the issue and the various workarounds.
Edgar, assuming your hardware is in working order, it should be OK to run the parity check and write corrections. That will make the parity accurately reflect your setup. Please let me know what you find out once you get the replacement card so I can purchase one. Thank you!!
Joseph (Author) Posted April 3, 2017
Does anyone know the difference between the Dell H310 0HV52W and the Dell H310 R1DNH? They look identical except for the sticker with the datamatrix code and the sticker on the chip.
0HV52W: http://www.ebay.com/itm/DELL-HV52W-RAID-CONTROLLER-PERC-H310-6GB-S-PCI-E-2-0-X8-0HV52W-/201657131656?hash=item2ef3b3a288:g:FfAAAOSwFdtXxe5r
R1DNH: http://www.ebay.com/itm/Dell-R1DNH-0R1DNH-PERC-H310-6GB-s-Low-Profile-SAS-RAID-Controller-w-Cable-B4-E-/131803966259?hash=item1eb020eb33:g:ZKkAAOSw9r1WCYXr
Joseph (Author) Posted April 3, 2017
11 hours ago, ashman70 said:
I had a SAS2LP controller in my SuperMicro server, I started getting regular errors in my parity check too, then things went crazy. In my case it may have been a compatibility issue between the controller and my backplane but they are made by the same company.... In any case I got rid of it and replaced it with an IT flashed Dell Perc H310 and my issues went away.
Which one is it, the 0HV52W or R1DNH? (See my other post below with links)
ashman70 Posted April 3, 2017
No idea, I'd have to open my server and physically look at my card. Are you connecting the HBA to a backplane or directly to drives?
Joseph (Author) Posted April 3, 2017
8 minutes ago, ashman70 said:
No idea, I'd have to open my server and physically look at my card. Are you connecting the HBA to a backplane or directly to drives?
Each cable connects to a drive cage with SATA drives installed. I believe this is equivalent to connecting directly to the hard drives.
JorgeB Posted April 3, 2017
Either will work after being flashed to IT mode.
Joseph (Author) Posted April 3, 2017 (edited)
Is there an online guide you can point me to for the correct firmware and how to flash it? I found one online that was somewhat confusing, and in the comments a couple of people said their H310 was bricked or had problems.
Edited April 3, 2017 by Joseph (additional thoughts)
JorgeB Posted April 3, 2017
http://lime-technology.com/wiki/index.php/Crossflashing_Controllers#LSI_SAS2008_chipset
Joseph (Author) Posted April 3, 2017
AWESOME! Thanks again, Johnnie.
crowdx42 Posted April 3, 2017
So is there a way within unRAID to stress the SAS cards? I have two Supermicro AOC-SAS2LP-MV8 cards, and over the weekend my whole box became unresponsive during a parity check. I ended up doing a hard reset, which I believe then caused parity errors (200 or so). My worry now is how to know whether it is the cards or a PSU issue (19 HDDs and 2 SSDs on a Corsair 850 PSU). My unRAID server is headless, so I am wondering whether it can be tested without attaching a monitor and keyboard. Thoughts?
Patrick
ashman70 Posted April 3, 2017
I had all kinds of random parity errors, and I have 30 drives in my server. In my case it was the SAS2LP card; once I replaced it, the random parity errors went away. You know your server better than any of us, so I am sure you can make an educated guess as to where the real problem lies.
Joseph (Author) Posted April 3, 2017
FWIW, I just pulled the trigger on the eBay one marked HV52W. Fast shipping; the card should be here by this Thursday. Will let you know how it goes.
crowdx42 Posted April 3, 2017
Well, I just upgraded my server with a new case and added additional hard drives. The previous weekend the parity check ran fine with no errors; it was this weekend that it had issues. I am also wondering about heat: the new box has the drives running a little hotter, so I would assume the SAS cards are also running a little hotter. Also, in the old setup I only had a single SAS card; I added a second one in the new setup to accommodate the extra drives.
Joseph (Author) Posted April 3, 2017
10 minutes ago, crowdx42 said:
My worry now is how to know if it is the cards, a psu issue (19 hdds and 2 SSDs on a Corsair 850 psu).
The only stress test I know of is the preclear test (which, as you know, will destroy the data on the drive). The thing is, if you're stress testing with faulty hardware, you won't be doing your drives or your mental state any favors. I'd re-seat all power and data cables. I did this on my build and wound up replacing some low-quality Y power connectors with higher-quality ones, and I found a couple of items that weren't seated properly. DOH!
Joseph (Author) Posted April 3, 2017
Anybody wanna buy a gently used SAS2LP-MV8? I'll mark it down 5%... 1% for each parity error I'm receiving!
Joseph (Author) Posted April 7, 2017 (edited)
OK guys, the H310 HV52W arrived today. However, I tried it in 2 boxes, and when the card is installed I get no POST and no beep. The green LED on the card blinks continually. Any thoughts on what needs to be done so I can get the card flashed for unRAID?
Edited April 7, 2017 by Joseph (clarification)
Joseph (Author) Posted April 7, 2017 (edited)
Just read up about masking the pins... trying that now.
Crazy, but it worked. Just flashed the card. It's no longer seen on boot; hope I didn't just bork it! Gonna test in the unRAID box.
PEEEOOOWWWW... Success! The boot time was so much faster I thought for sure the card was bricked. *whew* Will run a couple of parity checks and report back.
NEW CARD: DID NOT FIX PROBLEM!! FAIL!!
Edited April 8, 2017 by Joseph (FAIL!)