January 29, 2015 When will unRaid achieve this ;-) http://hardware.slashdot.org/story/15/01/29/1528208/proposed-disk-array-with-99999-availablity-for-4-years-sans-maintenance OK, seriously, an interesting article.
January 29, 2015 Even more interesting is the paper. http://arxiv.org/ftp/arxiv/papers/1501/1501.00513.pdf If you have a lot of data, even unlimited spares will not protect you. Still think single parity is good?
January 29, 2015 Five 9's is a VERY ambitious target, as you can easily see from these articles. I've worked in organizations where the goal was four 9's, and we spent many millions to achieve that level of reliability. Technology has evolved quite a bit since then, so it's a bit easier to add that extra 9, but it's still clearly an expensive proposition. For personal use, I'd settle for dual parity in UnRAID.
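To put the "extra 9" in perspective, here is a minimal Python sketch of the downtime budget each level allows. It assumes the nines are read as availability (fraction of a 365.25-day year the system is up), which is how the post above uses them; the paper itself measures the probability of not losing data over four years, which is a different quantity. The function name is made up for illustration.

```python
# Sketch: downtime budget implied by "N nines" of availability.
# Assumption: the nines describe uptime as a fraction of a 365.25-day year.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(nines: int) -> float:
    """Return the minutes of downtime per year allowed at the given number of nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 4 nines -> 0.0001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for nines in (4, 5):
        print(f"{nines} nines: {downtime_minutes_per_year(nines):.1f} min/year")
```

Under that assumption, four 9's leaves a bit under an hour of downtime per year, and five 9's leaves only about five minutes.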
January 29, 2015 Author "Even more interesting is the paper. [...] Still think single parity is good?" Yeah, that's what I really meant. I didn't want to direct-link to a .pdf. Well, if you have a lot of data, that means multiple independent arrays, not just a bigger array. As for single parity [shrug], I'm not sure we need 99.999% uptime with zero human intervention. Most of us have the luxury of taking down our arrays (ignoring that the lack of hot-swap mandates a shutdown) to deal with failures. In fact I'd say we all do, because we aren't losing $millions a minute.
January 29, 2015 While you can take an outage, for me it is more about data loss events, which the paper shows are going to happen even with RAID6 and unlimited spares. See Table II.
January 30, 2015 "When will unRaid achieve this ;-)" They'll be able to update you later about that soon.
January 30, 2015 Author "While you can take an outage, for me it is more about data loss events. [...] See Table II." Data loss is always a risk, even at 99.999%. But their main goal is to ask what it would take to get that AND do it with no intervention. If you are willing to accept human intervention, reliability and data integrity are much easier (structurally) to obtain, especially when you are dealing with smaller arrays. And a willingness to take the array down means your spare failure rate can be assumed to be lower, which will make quite a difference. Mind you, I'm in no way intending to argue against the need for dual parity. The article makes that clear enough ... as if we didn't already know it anyway.
January 30, 2015 Are we reading the same material? Here is the first paragraph: "Abstract: As the prices of magnetic storage continue to decrease, the cost of replacing failed disks becomes increasingly dominated by the cost of the service call itself. We propose to eliminate these calls by building disk arrays that contain enough spare disks to operate without any human intervention during their whole lifetime. To evaluate the feasibility of this approach, we have simulated the behavior of two-dimensional disk arrays with n parity disks and n(n-1)/2 data disks under realistic failure and repair assumptions. Our conclusion is that having n(n+1)/2 spare disks is more than enough to achieve a 99.999 percent probability of not losing data over four years. We observe that the same objectives cannot be reached with RAID level 6 organizations and would require RAID stripes that could tolerate triple disk failures." Human intervention only makes things worse. My days are filled with the effects of human intervention on storage systems. The data in Table II show that no number of spares will reach the goal, so the spare failure rate has nothing to do with it.
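For a sense of scale, the disk counts in the quoted abstract can be tabulated with a short Python sketch. It only evaluates the three formulas from the abstract (n parity disks, n(n-1)/2 data disks, n(n+1)/2 spares); the function name and the example value of n are illustrative assumptions, not taken from the paper.

```python
# Sketch: disk counts for the two-dimensional array described in the abstract.
# Only the three formulas come from the abstract; names and n values are illustrative.
def array_layout(n: int) -> dict:
    """Return parity, data, spare and total disk counts for a given n."""
    if n < 2:
        raise ValueError("n must be at least 2 to have any data disks")
    parity = n
    data = n * (n - 1) // 2      # n(n-1)/2 data disks
    spares = n * (n + 1) // 2    # n(n+1)/2 spares, "more than enough" per the abstract
    return {
        "parity": parity,
        "data": data,
        "spares": spares,
        "total": parity + data + spares,
    }


if __name__ == "__main__":
    print(array_layout(10))
```

With n = 10 this gives 45 data disks out of 110 in total, so well under half of the drives hold data. That is the price of never sending anyone to swap a disk.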
January 30, 2015 What alex is trying to say is that with a repair rate > 0 the goal would be easier to achieve.
February 12, 2015 I don't see why unRAID can't do this. I have an ancient version of FreeNAS running on an old HP Microserver at work, and it's over 4 years old. Last uptime was 370 days, and that was only defeated by the UPS dying.
February 12, 2015 Author Any one person, myself included, can anecdotally observe amazingly high reliability. But as a matter of practice, designing a system that is statistically likely to reach 99.999% reliability is very difficult, as the study shows, and as does practical experience from just reading this forum or knowing "the biz". UnRaid is nowhere close to achieving such a thing, so much so that it would be legally and morally suspect for them to claim it. But that is OK, because they aren't claiming it and we aren't using it in the hope that it does it anyway. If you NEED that level of reliability then you will need to pay for it, and it will be worth it, because at that point downtime is measured in $millions lost per unit time, if not more, and you're not taking the risk on a system with a single, non-hot-swappable parity drive.
February 12, 2015 High-reliability systems are more focused on uptime than specifically on data loss. No matter how many failures a system can tolerate, there still needs to be a solid backup strategy to avoid loss of data. Clearly data integrity is also important ... but data can be restored from backups, whereas you can't transact business if the system is down [whether the business is banking, stock market trading, managing a critical medical procedure, or managing a key strategic asset].
February 17, 2015 Systems with 99.999% uptime regarding data loss with zero maintenance have existed for a long time: books and cave paintings. Songs and anecdotes have been shown to suffer some data loss, but distribution was fast and cheap. Anyway, as far as I understand, Unraid does not protect against data loss; it protects against file loss. /René