dougnliz Posted December 8, 2015
From the posts in this thread it doesn't seem like the tunable setting completely resolves the issue. If replacing the controller will fully fix it then I'm sold. Maybe I'll keep the SAS2LP for an additional controller once this is actually resolved.
Doug
TUMS Posted December 8, 2015
v6.1.4 & the tunable setting (md_sync_thresh) should fix parity speed issues. The red ball issues are a totally different problem afaik.
BRiT Posted December 8, 2015
Quoting TUMS: v6.1.4 & the tunable setting (md_sync_thresh) should fix parity speed issues. The red ball issues are a totally different problem afaik.
Yes, two different issues. Not everyone suffers from both.
dougnliz Posted December 8, 2015
Yeah and the issue I'm having is the redball one. So just to confirm I do the below steps once I get my H310:
Download the files in Dell link in this post: http://lime-technology.com/forum/index.php?topic=12767.msg409244#msg409244
Then perform the steps in this post: http://lime-technology.com/forum/index.php?topic=12767.msg409058#msg409058
Thanks, Doug
JorgeB Posted December 8, 2015
Quoting dougnliz: Yeah and the issue I'm having is the redball one. So just to confirm I do the below steps once I get my H310: Download the files in Dell link in this post: http://lime-technology.com/forum/index.php?topic=12767.msg409244#msg409244 Then perform the steps in this post: http://lime-technology.com/forum/index.php?topic=12767.msg409058#msg409058 Thanks, Doug
That’s what I used to flash mine.
dougnliz Posted December 11, 2015
Well I got my card, but apparently have no way to flash it. None of my PCs except the UNRAID server have the correct PCI Express slots for this card. I can't seem to get my UNRAID server to boot into DOS. I tried one USB stick and when it boots my screen just says "Invalid Partition". I tried another stick which was actually my other UNRAID stick for a test server and that one won't boot either. I used that particular stick because I know it boots okay from USB. I confirmed both USB sticks boot to DOS on another PC, but that PC allows me to hit F8 for a boot menu and choose the USB. My UNRAID server has the SuperMicro X8SIL which doesn't have a boot menu, but allows you to choose the USB stick as the first boot device. I also tried the H310 in a PC with a PCIe x16 slot, but it wouldn't boot with the card in it. Booted fine after taking the H310 out. Anyone have any ideas how I can get this done?
dougnliz Posted December 11, 2015
Scratch that. I got it to boot by removing the power connectors from all my drives. Now the only problem is when I try the first command all I get back is the "Unknown command megarec" error. That file isn't in the files provided in the link. Where does this come from?
Doug
Frank1940 Posted December 11, 2015
Quoting dougnliz: Scratch that. I got it to boot by removing the power connectors from all my drives. Now the only problem is when I try the first command all I get back is the "Unknown command megarec" error. That file isn't in the files provided in the link. Where does this come from? Doug
Try Googling your error message. I got several hits....
dougnliz Posted December 11, 2015
Ok I got the megarec utility by downloading Fireball's package. Card is successfully flashed! Server booted up and everything looks normal. Now to see if I can get all my drives working correctly again and do a parity check without getting a random redball. Thanks everyone for the help.
Doug
ufopinball Posted December 11, 2015
Okay, I have received my pair of Dell PERC H310 controllers, flashed the firmware, and completed a parity check.
unRAID 5.0.6 with Supermicro AOC-SAS2LP-MV8 controllers:
Nov 1 08:42:22 Cortex kernel: md: sync done. time=34871sec (unRAID engine)
Nov 1 08:42:22 Cortex kernel: md: recovery thread sync completion status: 0 (unRAID engine)
unRAID 6.1.6 with Dell PERC H310 controllers:
Dec 11 01:47:10 Cortex kernel: md: sync done. time=32663sec
Dec 11 01:47:10 Cortex kernel: md: recovery thread sync completion status: 0
Timing is a bit better than before. Didn't have any issues at all with the drives/controllers. In fact, there's literally nothing in the syslog between the start/end of the parity check except my login to the console.
My only issue now is I have a bunch of drives that are flagged for potential SMART failure, due to the "Command Timeout" flag ... a result of the SAS2LP controllers losing track of the drives and thus causing the redball issue. Dunno what if anything can be done about that? The drives are good. At the very least, I can keep watching the syslog for any future errors.
PS: If anyone wants to gift me some forward breakout cables, I don't mind holding onto the SAS2LP cards and testing on my alternate server. PM me if you have some spares.
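To put "a bit better" in numbers, the two "sync done" lines quoted above can be compared directly. This is only a rough sketch: the regular expression and the helper names (sync_seconds, format_hms) are made up for illustration, and the only inputs are the two syslog lines copied from the post.

```python
import re

# The two "sync done" lines from the post above, copied verbatim.
SYSLOG_LINES = [
    "Nov 1 08:42:22 Cortex kernel: md: sync done. time=34871sec (unRAID engine)",
    "Dec 11 01:47:10 Cortex kernel: md: sync done. time=32663sec",
]

# Illustrative pattern: captures the elapsed seconds reported by the md driver.
SYNC_DONE = re.compile(r"md: sync done\. time=(\d+)sec")


def sync_seconds(line):
    """Return the elapsed seconds from a 'sync done' line, or None if absent."""
    match = SYNC_DONE.search(line)
    return int(match.group(1)) if match else None


def format_hms(total_seconds):
    """Format a number of seconds as hours, minutes and seconds."""
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}s"


old_s, new_s = (sync_seconds(line) for line in SYSLOG_LINES)
saved = old_s - new_s
percent = 100 * saved / old_s

print(f"SAS2LP (v5.0.6): {format_hms(old_s)}")
print(f"H310   (v6.1.6): {format_hms(new_s)}")
print(f"Saved: {format_hms(saved)} ({percent:.1f}% faster)")
```

Going by those two lines alone, the check dropped from roughly 9h 41m to 9h 04m, a saving of about 37 minutes (around 6%). Note that the unRAID version changed along with the controller, so the difference cannot be pinned on the cards alone.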
JorgeB Posted December 11, 2015
Quoting ufopinball: My only issue now is I have a bunch of drives that are flagged for potential SMART failure, due to the "Command Timeout" flag ... a result of the SAS2LP controllers losing track of the drives and thus causing the redball issue. Dunno what if anything can be done about that? The drives are good. At the very least, I can keep watching the syslog for any future errors.
Most believe there’s no point in monitoring that attribute for Seagates. You can disable it on each Seagate disk or globally in the global SMART settings.
ufopinball Posted December 11, 2015
Quoting ufopinball: My only issue now is I have a bunch of drives that are flagged for potential SMART failure, due to the "Command Timeout" flag ... a result of the SAS2LP controllers losing track of the drives and thus causing the redball issue. Dunno what if anything can be done about that? The drives are good. At the very least, I can keep watching the syslog for any future errors.
Quoting JorgeB: Most believe there’s no point in monitoring that attribute for Seagates. You can disable it on each Seagate disk or globally in the global SMART settings.
I did notice this only affected my Seagate drives ... I made the change and it solved my problem. Thanks for the tip!
garycase Posted December 11, 2015
Quoting JorgeB: ... you can disable it on each Seagate disk or globally in the global SMART settings.
Where is this change made? I don't see the option to do it (on 6.1.3 -- was it added after that?)
JorgeB Posted December 11, 2015
Quoting JorgeB: ... you can disable it on each Seagate disk or globally in the global SMART settings.
Quoting garycase: Where is this change made? I don't see the option to do it (on 6.1.3 -- was it added after that?)
It was added in 6.1.5, but there was a bug; it works great in 6.1.6.
dougnliz Posted December 12, 2015
Quoting dougnliz: Ok I got the megarec utility by downloading Fireball's package. Card is successfully flashed! Server booted up and everything looks normal. Now to see if I can get all my drives working correctly again and do a parity check without getting a random redball. Thanks everyone for the help. Doug
Well I'm able to rebuild drives and do parity checks now without getting read errors on other drives. So that's good. My last parity check average looks like it was around 90 MB/s. Unfortunately now I'm dealing with file system corruption issues, most likely because of all the rebuilds I tried to do when the SAS2LP was causing the redball errors. The disks with corruption are the same ones that were redballing. Whatever the bug is with that card is pretty ugly in v6. I never had trouble with v5 and now v6 has been nothing but headaches. Hopefully after getting this last drive file system check done I can get back to a stable environment I can just enjoy again.
Doug
BrianAz Posted December 12, 2015
I'm still experiencing a system hang every ~2-4 days w/ my 2x SAS2LP and RFS drives, triggered during writes. My system load will start climbing and never stop (gets well into triple digits)... shares become unresponsive and eventually all I can do is telnet in. However, Powerdown doesn't work, nor can I stop the array or unmount drives manually. There is absolutely nothing in the logs that I can see when this happens.
In the past, when I would run the mover (usually nightly) the problem would surface after a few days while the mover was running and I'd have to hard-boot to recover. Likewise, when the cache drive was disabled, the problem would happen every 2-4 days (all my data drives are RFS and on SAS2LP, Parity is on motherboard). My 500GB cache drive is XFS and uses a motherboard SATA connector. I've changed it to run the mover only monthly and have been going 10 days now writing to it daily via Sonarr and CP without issue (without running mover). My plan is to run through the holidays like this without running the mover, and if it doesn't hang I think I can safely assume it's a combination of the SAS2LP+RFS (and maybe something else specific to my system??) triggering the issue.
Someone mentioned to me that moving all their drives from RFS to XFS also corrected this issue for them, but honestly that'll take a LONG time. I have acquired 2x M1015s to install after the new year. I guess if my issue persists with the new controllers, I'll have to decide if I should move ahead with the RFS->XFS conversion or go back to unRAID v5.0.5, which I rarely had to think about. I considered moving to one of the newer unRAID versions now but I don't want to interrupt my test and I don't see anyone saying it resolved this particular issue for them. Hopefully I'll not have any crashes while writing solely to the cache and things will be back to the level of stability I enjoyed in 5.0.5 again after I move to M1015s.
dougnliz Posted December 12, 2015
A system hang like you describe is how I found this last round of file system corruption. I was doing a parity check and near the end the webgui became unresponsive. I could telnet in still so I took a quick look at the syslog and at the end of it I saw the file system errors telling me to run reiserfsck. I tried to reboot from the command prompt but even that hung and I had to manually power off the system. After rebooting I ran the reiserfsck on the disk that was throwing the error and I needed to use the --rebuild-tree option. That will still be running for several hours, so while I wait for that I'm checking the other drives in another session just to be sure I don't have more.
Good luck resolving your issue. After all the issues I've had I'd recommend replacing those controllers sooner rather than later, especially since you have them in hand. I've lost some of my data due to these issues. Luckily nothing important, but still a PITA.
BrianAz Posted December 12, 2015
Thanks. I have never had a problem with parity checks (aside from them being super slow before LT fixed that) or red-ball but decided to run reiserfsck on each drive after the last hang. Everything came back fine so thankfully I don't seem to be impacted by the corruption issue. I agree that though LT has been fixing things as they can, this controller just seems to cause trouble. I'm done with them. As soon as I get back from my holiday trip, they're coming out and going on eBay.
garycase Posted December 13, 2015
Quoting BrianAz: ... I'm done with them. As soon as I get back from my holiday trip, they're coming out and going on eBay.
Easy to understand that sentiment. I have no idea what happened in v6 that has made these so much of a hassle, but I agree I'd certainly not buy one now. Not very long ago they'd have been my first choice for a new 8-port card ... but that's certainly no longer the case. What's difficult to understand is why they work perfectly in Windows machines or with v5 of UnRAID ... and, for that matter, on SOME folks' v6 systems, but not others. [I suspect if we had more complete data we'd find some relationship between the systems that are having problems and a particular chipset (or chipsets)]
Whaler_99 Posted December 13, 2015
Something interesting with these cards. I have three friends all using them in new builds. New as in v6 builds. None of them had a v5 machine and all three systems are working fine.
BrianAz Posted December 13, 2015
Quoting Whaler_99: Something interesting with these cards. I have three friends all using them in new builds. New as in v6 builds. None of them had a v5 machine and all three systems are working fine.
Are they all using XFS (or BTRFS)? I suspect it's localized to SAS2LP-MV8 cards + ReiserFS + Writing operations + (possibly) some additional factor in my setup.
bkastner Posted December 14, 2015
Quoting Whaler_99: Something interesting with these cards. I have three friends all using them in new builds. New as in v6 builds. None of them had a v5 machine and all three systems are working fine.
Quoting BrianAz: Are they all using XFS (or BTRFS)? I suspect it's localized to SAS2LP-MV8 cards + ReiserFS + Writing operations + (possibly) some additional factor in my setup.
I suffered from these issues as well, and had converted my entire server to XFS.
BrianAz Posted December 14, 2015
Thanks. Glad I'm getting rid of these controllers. There seems to be a new challenge to overcome at every turn.
Whaler_99 Posted December 14, 2015
Quoting Whaler_99: Something interesting with these cards. I have three friends all using them in new builds. New as in v6 builds. None of them had a v5 machine and all three systems are working fine.
Quoting BrianAz: Are they all using XFS (or BTRFS)? I suspect it's localized to SAS2LP-MV8 cards + ReiserFS + Writing operations + (possibly) some additional factor in my setup.
I will ask and find out...
ufopinball Posted December 14, 2015
Quoting Whaler_99: Something interesting with these cards. I have three friends all using them in new builds. New as in v6 builds. None of them had a v5 machine and all three systems are working fine.
Quoting BrianAz: Are they all using XFS (or BTRFS)? I suspect it's localized to SAS2LP-MV8 cards + ReiserFS + Writing operations + (possibly) some additional factor in my setup.
Interestingly enough, the problem with SAS2LP (at least in my case) seemed to come during a Parity Check. Isn't that a read-only operation at the sector level, and is therefore considered file-system agnostic?
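For what it's worth, the "file-system agnostic" point can be illustrated with a toy model. The sketch below assumes single parity is a plain bytewise XOR across the data disks, which is how unRAID's single parity is generally described; the function names and the sample sector contents are hypothetical, and this is not unRAID's actual md driver code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (one block per data disk)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_matches(data_blocks, parity_block):
    """A read-only check: recompute parity from raw bytes and compare."""
    return xor_blocks(data_blocks) == parity_block


# Hypothetical sector contents. The check never interprets them, so it makes
# no difference whether a disk holds ReiserFS, XFS, or nothing at all.
disk1 = b"ReiserFS metadata?"
disk2 = b"XFS data or zeros."
disk3 = bytes(len(disk1))  # an all-zero sector

parity = xor_blocks([disk1, disk2, disk3])
print(parity_matches([disk1, disk2, disk3], parity))

# A single flipped bit on any disk makes the comparison fail.
corrupted = bytes([disk1[0] ^ 0x01]) + disk1[1:]
print(parity_matches([corrupted, disk2, disk3], parity))
```

Under that assumption the check only reads raw sectors and compares bytes, so a failure during a parity check points at the read path (drive, cable, controller, driver) rather than at the file system sitting on top.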