crowdx42 Posted April 10, 2017
Yes, I have noted the Supermicro AOC-SAS2LP-MV8 does get hot. I wonder if moving my parity disks to separate controllers helped resolve my issue. I also have a server chassis with pretty good airflow. It makes sense that parity checks would uncover these types of issues, as it is not often in a daily unRAID setup that all drives are spun up and actually working.
Joseph Posted April 10, 2017 Author
Is there a read-me-first-before-building-an-unRAID-box sticky somewhere that has a list of hardware to avoid?
crowdx42 Posted April 10, 2017
Just now, Joseph said: Is there a read-me-first-before-building-an-unRAID-box stickie somewhere that has a list of hardware to avoid?
Well, with the number of people who are now having issues with the Supermicro AOC-SAS2LP-MV8, it would probably be on the list. I actually bought two new ones because the first one I had in my previous build worked perfectly for over 2 years.
DoeBoye Posted April 10, 2017 (edited)
1 hour ago, johnnie.black said: It's an issue with linux, mostly on latest kernels, mostly with vt-d enabled and it doesn't affect every user.
I used to be in the "doesn't affect every user" group. I was aware of the problem, but foolishly thought I was immune. Then my onboard Marvell controller had a hissy fit, and I ended up losing 3 drives connected to it. I recovered 2 with dual parity, but the third had some serious issues that xfs_repair could not fix. I managed to recover a bunch of files through recovery software, but it was not a fun task.
Moral of the story: get off Marvell controllers ASAP, especially if you're using vt-d. For the cost of a used Dell 310 controller, I could have avoided the hours I've spent recovering drives and content. Sigh. Live and learn.
Edited April 10, 2017 by DoeBoye
Joseph Posted April 10, 2017 Author
Here's my known list:
Hardware | Type | Mfg | Issue
P8Z68 DELUXE/GEN3 | Mobo | ASUS | Potential timeout issues with Marvell controller causing parity errors. Disable vt-d or avoid using Marvell ports.
AOC-SAS2LP-MV8 | HBA | Supermicro | Potential timeout issues with Marvell controller causing parity errors. Disable vt-d or use preferred HBA LSI2008.
JorgeB Posted April 10, 2017
Follow up, this is what I would recommend:
Parity - currently on onboard Intel port 1 (SATA3) - move to free Intel port 3 (SATA2)
cache2 - currently on onboard Intel port 2 (SATA3) - leave as is
cache - currently on Marvell - move to Intel port 1 (SATA3)
parity2 - currently on Marvell - move to free Intel port 4 or free LSI port
Joseph Posted April 10, 2017 Author (edited)
7 minutes ago, johnnie.black said: Follow up, this is what I would recommend:
Parity - currently on onboard Intel port 1 (SATA3) - move to free Intel port 3 (SATA2)
cache2 - currently on onboard Intel port 2 (SATA3) - leave as is
cache - currently on Marvell - move to Intel port 1 (SATA3)
parity2 - currently on Marvell - move to free Intel port 4 or free LSI port
Thank you SO MUCH for this! If I can't get to it this evening, it will have to be this weekend. Regardless, I'll keep this thread posted.
Edited April 10, 2017 by Joseph (grammar)
Joseph Posted April 11, 2017 Author
So it seems the parity issues are actually caused when shutting down... The parity check I've been running today (after last night's reboot, which had errors) has come back clean.
@johnnie.black, I'm not going to have time to move the HDDs off the onboard Marvell controller until this weekend. However, just for fun I'm going to reboot and run the parity check again overnight. If things happen the way I anticipate, it will have 5 errors around the 50% complete mark. Then I'll do another check tomorrow without a reboot and see if it's OK. Unless I've missed something, this pattern will confirm it's definitely happening on reboot. I'm sure the drives love the fitness test.
JorgeB Posted April 11, 2017
Sure, do it. It would be good to confirm whether errors appear only after a reboot; it may help diagnose other users with similar issues.
Joseph Posted April 11, 2017 Author
8 hours ago, johnnie.black said: Sure, do it, it would good to confirm if errors appear only after a reboot, it may help diagnose other users with similar issues.
As suspected, 5 parity errors after reboot.
Apr 11 03:05:15 Tower kernel: md: recovery thread: Q corrected, sector=3519069768
Apr 11 03:05:15 Tower kernel: md: recovery thread: Q corrected, sector=3519069776
Apr 11 03:05:15 Tower kernel: md: recovery thread: Q corrected, sector=3519069784
Apr 11 03:05:15 Tower kernel: md: recovery thread: Q corrected, sector=3519069792
Apr 11 03:05:15 Tower kernel: md: recovery thread: Q corrected, sector=3519069800
Next steps: move drives off of the onboard Marvell controller. Wash. Rinse. Repeat. I won't be able to get to it until this weekend. Running one more check now without a reboot to confirm the pattern.
Joseph Posted April 11, 2017 Author (edited)
It is interesting to note that someone else had a similar issue 3 years ago, with a similar pattern in the affected sectors:
MINE: sector=3519069768 sector=3519069776 sector=3519069784 sector=3519069792 sector=3519069800
THEIRS: sector=1565565768 sector=1565565776 sector=1565565784 sector=1565565792 sector=1565565800
source: https://forums.lime-technology.com/topic/30016-monthly-parity-check-found-5-errors/
Edited April 11, 2017 by Joseph (typo)
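One way to make the "similar pattern" concrete is to compare the spacing between consecutive sector numbers in the two reports. The following is a minimal Python sketch, not anything from unRAID itself: the helper names are made up for illustration, and the 512-byte sector size is an assumption that the thread does not confirm.

```python
import re

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors (not stated in the thread)


def extract_sectors(text):
    """Return every sector number found in 'sector=NNN' tokens, in order."""
    return [int(n) for n in re.findall(r"sector=(\d+)", text)]


def spacings(sectors):
    """Differences between consecutive sector numbers."""
    return [b - a for a, b in zip(sectors, sectors[1:])]


# Sector lists copied from the post above.
mine = extract_sectors(
    "sector=3519069768 sector=3519069776 sector=3519069784 "
    "sector=3519069792 sector=3519069800"
)
theirs = extract_sectors(
    "sector=1565565768 sector=1565565776 sector=1565565784 "
    "sector=1565565792 sector=1565565800"
)

for label, sectors in (("mine", mine), ("theirs", theirs)):
    print(label, "spacing:", spacings(sectors),
          "first byte offset:", sectors[0] * SECTOR_BYTES)
```

Both lists step by 8 sectors, which would be 4096 bytes under the 512-byte assumption, so each report looks like one short contiguous run rather than scattered errors. Under the same assumption, the first sector in the "MINE" list sits roughly 1.8 TB into the disk.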
crowdx42 Posted April 11, 2017
Well, I just finished a parity check on my system and I have 141 errors. I'm running it again to see if I get the same result after a second check. From what I see, the SATA on the MSI H97 is all Intel, so that only leaves the SAS cards. If I get more errors tomorrow I will just bite the bullet and replace them. Are there any plug-and-play replacement options that don't need the cards to be flashed to a different BIOS?
Joseph Posted April 11, 2017 Author
AFAIK, the Dell card needs to be flashed with LSI2008 firmware so it will play nice with unRAID. It's fairly easy to do; the hardest part was cutting a sliver of electrical tape to cover the pins. I suppose you could buy an LSI2008-branded card and not have to go through the process. I don't know if there are other cards that work natively; hopefully someone will post feedback.
crowdx42 Posted April 11, 2017
Well, I just checked my SAS cards and they are on the latest firmware, but they had INT13 enabled, which according to the Limetech hardware forum sticky should be disabled. I have not disabled vt-d yet, as I may want to play with some virtual machines, so fingers crossed. I will re-run the parity check and see what I get.
crowdx42 Posted April 11, 2017
So it seems that disabling INT13 REALLY slows the cards down to treacle. I looked it up and it relates to SCSI, so I am now wondering whether a SATA setup should have INT13 enabled. Thoughts?
crowdx42 Posted April 11, 2017
Well, regardless, I just bought a couple of the Dell 310s on eBay. They seem to spec a little faster than the Supermicro cards (6 Gbps vs 5 Gbps). It just seems the more I read, the more the Supermicro cards make me nervous.
DoeBoye Posted April 11, 2017
2 hours ago, crowdx42 said: Well regardless, I just bought a couple of the Dell 310s on ebay. They seem to spec a little faster than the Supermicro cards (6gbps vs 5gbps). It just seems the more I read the more the Supermicro cards make me nervous
Good call. It's not fun if the Marvell card dumps or corrupts a bunch of drives. I picked up a used 310 off eBay a few weeks ago. Once I flashed my Dell 310 with fireball3's script, it ran perfectly. No need to cover any pins with tape on mine.
crowdx42 Posted April 11, 2017
Well, both of the ones I ordered are DELL HV52W RAID CONTROLLER PERC H310 6GB/S PCI-E 2.0 X8 0HV52W. I am hoping that I still get good speeds with them. The parity check with the Supermicro cards, now with INT13 disabled, is painfully slow at only around 70 MB/sec; with INT13 enabled they were averaging around 130 MB/sec.
DoeBoye Posted April 11, 2017
6 minutes ago, crowdx42 said: Well both of the ones I ordered are DELL HV52W RAID CONTROLLER PERC H310 6GB/S PCI-E 2.0 X8 0HV52W
That's the same model I ordered, so hopefully it will work just as well. Using fireball3's script, I had to use the 'b' option when flashing to IT. If you use his script and have to do the same, make sure to send him the output txt file.
Dell HV52W PERC H310 8-Port 6Gb/s SAS/SATA RAID Controller
Joseph Posted April 11, 2017 Author
Parity check update: 60% complete; 0 errors. Unfortunately, this is all the testing I can do until this weekend. But the pattern is 5 errors, on the same sectors, every time the unRAID box is rebooted. I believe someone suggested that this means it's hanging on a drive during shutdown and then not shutting down cleanly (even though I don't get an unclean shutdown error), and I'm inclined to agree. In any case, I'll move the drives off the onboard Marvell controller, run tests, and keep y'all posted. Thanks everyone for weighing in.
Joseph Posted April 11, 2017 Author
1 hour ago, crowdx42 said: ...with INT13 enabled they were hitting on average around 130MB/sec
I only get about 125 MB/sec with parity checks; data moves are slower. I use cache drives, but I'd like to see improvement on the array.
crowdx42 Posted April 11, 2017
Well, I have two 1 TB SSDs in RAID 1 for day-to-day loads to the server, but it makes me nervous having parity checks running too long.
EdgarWallace Posted April 12, 2017 (edited)
On 3.4.2017 at 5:04 PM, EdgarWallace said: I have installed a SAS2LP in my backup server, disabled VT-d and am still having parity check errors. I ordered a Dell Perc H200 today and will report back if that card resolves my issues. Btw, is it safe to run the parity check once the new controller is installed with the "Write corrections to parity" option?
I have finally received my Dell Perc H200, and thanks to the HUGE efforts of @Fireball3 my backup server is running its first parity check. I have unchecked the "Write corrections to parity" option in order to see whether the server stays at the 8 errors that came up with the AOC-SAS2LP-MV8 card. If everything runs well, I will check again with the "write corrections" option enabled and then run a third check. Fingers crossed.
EDIT: 9 sync errors detected at 38.3% of the parity sync. What is making me nervous is this: Pastebin
Edited April 13, 2017 by EdgarWallace
Joseph Posted April 12, 2017 Author
17 hours ago, crowdx42 said: Well I have two 1tb SSDs in RAID 1 for day to day loads to the server but it makes me nervous having parity checks running too long
It takes ~10-12 hours to get through a 4TB parity disk check... I can only imagine 8TB must be about twice that!
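The "about twice that" guess can be checked with simple arithmetic. This is a rough Python sketch under stated assumptions: decimal units (1 TB = 1,000,000 MB), a constant average read speed across the whole disk, and function names invented for illustration. The 125 MB/sec figure is the parity check speed quoted earlier in the thread.

```python
def parity_check_hours(capacity_tb, avg_mb_per_s):
    """Estimated hours to read a disk end to end at a constant average speed."""
    total_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / avg_mb_per_s / 3600


def implied_avg_speed(capacity_tb, hours):
    """Average MB/sec implied by finishing capacity_tb in the given hours."""
    total_mb = capacity_tb * 1_000_000
    return total_mb / (hours * 3600)


print(round(parity_check_hours(4, 125), 1))  # 4 TB at the quoted 125 MB/sec
print(round(parity_check_hours(8, 125), 1))  # 8 TB at the same average speed
print(round(implied_avg_speed(4, 11)))       # speed implied by an 11 hour run
```

At a steady 125 MB/sec a 4 TB check would take a little under 9 hours, so the observed 10 to 12 hours suggests the average over the whole disk is closer to 100 MB/sec. Because the estimate is linear in capacity, an 8 TB disk at the same average speed would indeed take about twice as long.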
Joseph Posted April 12, 2017 Author
3 hours ago, EdgarWallace said: If everything's running well I will check again including the "write corrections" options and running a 3rd check finally. Cross fingers.....
I would just run the parity check and write corrections. If errors are found on the first run, then run it again to make sure the issues have been resolved with the new controller. Let us know how it goes.