garycase Posted November 4, 2015 Does the nr_requests script need to be added to the Go file while we wait for the next update? I boot very rarely, so it's easy enough to just manually do the updates when I do that. I'm optimistic that this issue will be resolved in a future release (hopefully the next one), either by changing the default or by providing a disk "tunable" that allows the user to set the value.
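For anyone applying the tweak by hand in the meantime, here is a minimal sketch of what such a script does, assuming the standard Linux sysfs knob at /sys/block/<device>/queue/nr_requests. It is written in Python purely for illustration; the function name set_nr_requests and the sysfs_root parameter are hypothetical, and the value 8 is simply the one reported elsewhere in this thread, not an official recommendation.

```python
from pathlib import Path


def set_nr_requests(devices, value=8, sysfs_root="/sys/block"):
    """Write `value` to queue/nr_requests for each device; return the devices updated.

    Assumption: devices are kernel names such as "sdb" (hypothetical here).
    Devices without the sysfs knob are skipped rather than treated as errors.
    """
    updated = []
    for dev in devices:
        knob = Path(sysfs_root) / dev / "queue" / "nr_requests"
        if not knob.exists():
            continue  # device not present on this system
        knob.write_text(f"{value}\n")  # needs root on a real system
        updated.append(dev)
    return updated
```

Calling set_nr_requests(["sdb", "sdc"]) as root would apply the value to those two (hypothetical) devices. The setting does not survive a reboot, which is why the question of running it at boot comes up.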
jonp Posted November 4, 2015 I am testing a build right now that has nr requests as a tunable setting in the webgui (amongst many many other changes)
TODDLT Author Posted November 4, 2015 I couldn't help but notice you mention this a couple times and somehow missed it up until tonight. What do we know about this issue? Do we know what the problem configurations are? Is there some way to figure out if you have this issue before it causes a data issue, or is there no way to know until it's too late? If there is some way to "know" if I am safe or not, I'll try the SAS2LP again this weekend and see what happens. Certainly, we can't ask anyone to test something that could cause data issues, so ideally there's someone who can try it on a test system. Or at least on a system with EVERYTHING backed up elsewhere. If you do decide to test, we'll be grateful! But it IS YOUR data, not ours we're risking! On the other hand, if it has already caused parity issues, where you can't trust the current validity of your parity drive, then you're already in trouble, and have less to lose. You NEED a parity drive you have confidence in, so this testing has clear benefits over the risks for you. (not sure how clearly I worded that) I have run the SAS2LP in my server before. Slow speeds but no errors. I'm confident in my parity as it stands right now. So does not having seen a parity check error occur in one parity check on this server mean I'm safe (as it relates to this specific error)? Is this the type of problem that either you have always due to hardware configuration or never have? Or should I continue to be concerned?
garycase Posted November 4, 2015 I am testing a build right now that has nr requests as a tunable setting in the webgui (amongst many many other changes) Excellent. Looking forward to it.
garycase Posted November 4, 2015 I am testing a build right now that has nr requests as a tunable setting in the webgui (amongst many many other changes) BTW, this would be a VERY useful feature if there's any chance of getting it into the next release: http://lime-technology.com/forum/index.php?topic=43765.0 ... it would have allowed two cases I'm aware of (and probably more) to be easily resolved instead of resulting in lost data.
RobJ Posted November 4, 2015 I couldn't help but notice you mention this a couple times and somehow missed it up until tonight. What do we know about this issue? Do we know what the problem configurations are? Is there some way to figure out if you have this issue before it causes a data issue, or is there no way to know until it's too late? If there is some way to "know" if I am safe or not, I'll try the SAS2LP again this weekend and see what happens. Certainly, we can't ask anyone to test something that could cause data issues, so ideally there's someone who can try it on a test system. Or at least on a system with EVERYTHING backed up elsewhere. If you do decide to test, we'll be grateful! But it IS YOUR data, not ours we're risking! On the other hand, if it has already caused parity issues, where you can't trust the current validity of your parity drive, then you're already in trouble, and have less to lose. You NEED a parity drive you have confidence in, so this testing has clear benefits over the risks for you. (not sure how clearly I worded that) I have run the SAS2LP in my server before. Slow speeds but no errors. I'm confident in my parity as it stands right now. So does not having seen a parity check error occur in one parity check on this server mean I'm safe (as it relates to this specific error)? Is this the type of problem that either you have always due to hardware configuration or never have? Or should I continue to be concerned? I don't think anyone here can definitively say there's no risk, or how big a risk, because I doubt anyone here has seen the code, examined it and figured out what's wrong, and that's what it would take to be sure of what's risky and what's not. But if all you have experienced is parity check slowness, then there are no reports or even hints of problems with that.
My concern and cautions were more for those with a SAS2LP that has caused more serious issues, either parity errors or possibly even data loss.
JorgeB Posted November 4, 2015 Dumb question... Do we think this setting will improve parity check speeds with the original SASLP card? (and V6) No idea. Looking at SuperMicro's specs, it IS based on a Marvell chip (Marvell 6480), but I don't know if that has the same issues as the 9480 chip used in the SAS2LP. Very easy to test => just do a parity check (Just run it for perhaps 10 minutes and update the status); then change the nr_requests values and repeat the process. Ok.. With the default.. I was at 90-95MB/s With the change I was at ~102-106MB/s So there is a speed up for the SASLP original card as well. I guess I put this in the go script? Jim Just out of curiosity and to see how your speed compares with my own testing, how many disks are on the SASLP? I’m guessing no more than 6?
jbuszkie Posted November 4, 2015 Just out of curiosity and to see how your speed compares with my own testing, how many disks are on the SASLP? I’m guessing no more than 6? I have 8... but 2 are not in the array .. in fact.. I may have been less than 6 as I'm not sure where my cache disks (2) are located! Jim
jbuszkie Posted November 4, 2015 Just out of curiosity and to see how your speed compares with my own testing, how many disks are on the SASLP? I’m guessing no more than 6? I have 8... but 2 are not in the array .. in fact.. I may have been less than 6 as I'm not sure where my cache disks (2) are located! Jim If I trust the ordering SDa-SDx... I had 5 disks in the parity check on the SASLP controller. (11 total in the array including parity)
JorgeB Posted November 4, 2015 Just out of curiosity and to see how your speed compares with my own testing, how many disks are on the SASLP? I’m guessing no more than 6? I have 8... but 2 are not in the array .. in fact.. I may have been less than 6 as I'm not sure where my cache disks (2) are located! Jim If I trust the ordering SDa-SDx... I had 5 disks in the parity check on the SASLP controller. (11 total in the array including parity) I expected 6 array disks, as your speed is very close to what I got from my tests with 6 SSDs: default: 93.1MB/s; nr_requests=8: 105MB/s. In any case your improvement is in line with what I get with 5 to 8 disks, between 9 and 12%, so I believe most SASLP users will also benefit from this tweak, naturally only if there aren't any other bottlenecks.
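The percentage gain can be worked out directly from the figures quoted in these posts. The snippet below is only an illustrative calculation: the helper name percent_gain is hypothetical, and taking the midpoints of the reported ranges (90-95MB/s and 102-106MB/s) is an assumption made here, not something the posters did.

```python
def percent_gain(before_mb_s, after_mb_s):
    """Relative throughput improvement, in percent."""
    return (after_mb_s - before_mb_s) / before_mb_s * 100.0


# 6-SSD test: default vs. nr_requests=8
ssd_gain = percent_gain(93.1, 105.0)

# SASLP report: midpoints of the quoted ranges (assumption)
saslp_gain = percent_gain((90 + 95) / 2, (102 + 106) / 2)

print(f"SSD test: {ssd_gain:.1f}%  SASLP report: {saslp_gain:.1f}%")
```

Both cases come out at roughly 12 to 13%, so the two sets of numbers tell the same story.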
jonp Posted November 4, 2015 So another update that I think everyone will be happy to hear. Tom wasn't happy with having to resort to the nr requests tweak to solve this issue. Instead, he has been examining and tweaking driver code to see if he could make improvements from that side. I'm pleased to report that in testing today, we were able to get back to pretty much full v5 performance on our latest build with this controller and without any tweaking to the nr requests item. We still will have nr requests as a tunable setting under disk settings, but we are hopeful that most won't even have to resort to that with 6.2.
tdallen Posted November 5, 2015 That's great news, Jon! I thought I would report my nr_requests results. The array is a 6TB parity, a 6TB data, and three 3TB data drives. I went from a parity check time of just under 26 hours averaging 64.6 MB/s to 15.5 hours at an average of 107.5 MB/s. I don't have good CPU utilization figures but it appears to have gone up modestly - which I suppose makes sense since the CPU is now working harder rather than sitting in wait states. All in all a huge improvement! I'm one of the people who had consistently reproducible false red balls during parity checks. I implemented a bunch of workarounds to get past that. Hopefully between the nr_requests change and/or the driver tweaks Tom is working on that issue will be addressed as well.
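Those times line up with the reported averages. As a rough sanity check, assuming the parity check has to read the full 6TB of the largest disk and that TB and MB are decimal units (10^12 and 10^6 bytes), the duration is just size divided by average throughput. The function name parity_check_hours is hypothetical.

```python
def parity_check_hours(largest_disk_tb, avg_mb_s):
    """Estimated duration in hours: bytes to read divided by average bytes/second.

    Assumes decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
    """
    total_bytes = largest_disk_tb * 1e12
    seconds = total_bytes / (avg_mb_s * 1e6)
    return seconds / 3600.0


before = parity_check_hours(6, 64.6)   # default nr_requests
after = parity_check_hours(6, 107.5)   # with the tweak
print(f"before: {before:.1f} h, after: {after:.1f} h, speedup: {before / after:.2f}x")
```

Under those assumptions the estimate is about 25.8 hours before and 15.5 hours after, matching "just under 26 hours" and "15.5 hours" above.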
TODDLT Author Posted November 5, 2015 I don't think anyone here can definitively say there's no risk, or how big a risk, because I doubt anyone here has seen the code, examined it and figured out what's wrong, and that's what it would take to be sure of what's risky and what's not. But if all you have experienced is parity check slowness, then there are no reports or even hints of problems with that. My concern and cautions were more for those with a SAS2LP that has caused more serious issues, either parity errors or possibly even data loss. What I was trying to figure out is whether you had heard of this being a sleeper, i.e. everything working fine with no issues and then one day you are getting problems, or if it's one of those things that would show up as soon as I tried to do a parity check?
JorgeB Posted November 5, 2015 So another update that I think everyone will be happy to hear. Tom wasn't happy with having to resort to the nr requests tweak to solve this issue. Instead, he has been examining and tweaking driver code to see if he could make improvements from that side. I'm pleased to report that in testing today, we were able to get back to pretty much full v5 performance on our latest build with this controller and without any tweaking to the nr requests item. Although the tweak fixed the issue for me (and from reading this thread appears to work for everyone else with the SAS2LP), I’m very happy that Tom has fixed the underlying issue. I believe that the SAS2LP performance has degraded further with almost every new kernel release since V5beta12, so there was a risk that the issue could reemerge in the future. I am testing a build right now that has nr requests as a tunable setting in the webgui (amongst many many other changes) We still will have nr requests as a tunable setting under disk settings, but we are hopeful that most won't even have to resort to that with 6.2. Hmmm, I wonder if that means I’ll have to buy a new disk for my servers soon…
gubbgnutten Posted November 5, 2015 So another update that I think everyone will be happy to hear. Tom wasn't happy with having to resort to the nr requests tweak to solve this issue. Instead, he has been examining and tweaking driver code to see if he could make improvements from that side. I'm pleased to report that in testing today, we were able to get back to pretty much full v5 performance on our latest build with this controller and without any tweaking to the nr requests item. We still will have nr requests as a tunable setting under disk settings, but we are hopeful that most won't even have to resort to that with 6.2. Great news! Just curious, are the tweaks specific for SAS2LP, or will for example other Marvell users benefit?
garycase Posted November 5, 2015 So another update that I think everyone will be happy to hear. Tom wasn't happy with having to resort to the nr requests tweak to solve this issue. Instead, he has been examining and tweaking driver code to see if he could make improvements from that side. I'm pleased to report that in testing today, we were able to get back to pretty much full v5 performance on our latest build with this controller and without any tweaking to the nr requests item. We still will have nr requests as a tunable setting under disk settings, but we are hopeful that most won't even have to resort to that with 6.2. Excellent news. Is this a general update to Marvell-based drivers, or is it specific to the SAS2LP? To be specific for my case (and, I suspect many others, since Tom used to use these cards a lot) will it help with 1430SA's? Glad to see the nr_requests is still going to be a tunable "just in case" ... but it'd definitely be nice if this wasn't needed. I hesitate to ask ... but curiosity demands that I do => any idea of the likely timeline for this update?
garycase Posted November 5, 2015 Soon. I assume the same thing ... but "soon" in "UnRAID speak" has historically had a VERY wide range of meanings
jonp Posted November 5, 2015 So another update that I think everyone will be happy to hear. Tom wasn't happy with having to resort to the nr requests tweak to solve this issue. Instead, he has been examining and tweaking driver code to see if he could make improvements from that side. I'm pleased to report that in testing today, we were able to get back to pretty much full v5 performance on our latest build with this controller and without any tweaking to the nr requests item. We still will have nr requests as a tunable setting under disk settings, but we are hopeful that most won't even have to resort to that with 6.2. Excellent news. Is this a general update to Marvell-based drivers, or is it specific to the SAS2LP? To be specific for my case (and, I suspect many others, since Tom used to use these cards a lot) will it help with 1430SA's? Glad to see the nr_requests is still going to be a tunable "just in case" ... but it'd definitely be nice if this wasn't needed. I hesitate to ask ... but curiosity demands that I do => any idea of the likely timeline for this update? It was an update to the unRAID driver, not the Marvell controller driver. As far as release timeframe, I think we have been doing a pretty good job with follow up releases since 6 was pushed out (already at 6.1.3, actively working on 6.2 now). So when I say "soon" here, take it in that context.
garycase Posted November 5, 2015 Agree => both the releases and the communications have been MUCH better in the past year or so.
wgstarks Posted November 5, 2015 Is it "Soon" yet?
jonp Posted November 5, 2015 We "may" push out a 6.1.4 if 6.2 internal testing takes too much longer. I'm hoping it doesn't come to that though. 6.2 is looking mighty sweet. Bunch of features packed in that NO ONE saw coming. ;-). Sorry to tease.
Kir Posted November 5, 2015 I wonder if it's going to help those of us who had crashes with SAS2LP doing parity check?
mr-hexen Posted November 5, 2015 Bunch of features packed in that NO ONE saw coming. ;-). Sorry to tease. Damn You!!
RobJ Posted November 5, 2015 I don't think anyone here can definitively say there's no risk, or how big a risk, because I doubt anyone here has seen the code, examined it and figured out what's wrong, and that's what it would take to be sure of what's risky and what's not. But if all you have experienced is parity check slowness, then there are no reports or even hints of problems with that. My concern and cautions were more for those with a SAS2LP that has caused more serious issues, either parity errors or possibly even data loss. What I was trying to figure out is whether you had heard of this being a sleeper, i.e. everything working fine with no issues and then one day you are getting problems, or if it's one of those things that would show up as soon as I tried to do a parity check? From all reports, there's nothing 'sleeper' about this; problems show up fairly quickly. The problems may be different on different systems, but are consistent within any given system.