SME Storage Posted June 9, 2018

Dear Unraiders and Lime Tech,

As a storage professional I work with storage on a daily basis, so I went looking for hot spare functionality in Unraid and did not find it. (Feel free to correct me if I am wrong.)

Definitions:
Hot Spare: In case of a disk failure, a hot spare disk is automatically added to the array and triggers the normal data/parity rebuild.
Global Hot Spare: A hot spare disk that can be used by the complete array.
Local Hot Spare: A hot spare intended for one single disk pool.

The question is: does Unraid require hot spare functionality? Just remember why we use NAS storage: we like to have some level of hardware redundancy for our data. Hot spares can be an addition to the overall package of hardware redundancy and, when used, raise the redundancy level. There are many different (hardware) redundancy solutions you can add to your Unraid system. It is just a matter of how far you want to take it and how important your data is to you. The slider always moves between cost on one side and the highest possible data redundancy on the other.

The benefit a hot spare feature would bring is that it is fully automatic from the moment a hot spare has been made available. Another benefit is that the time your array runs in degraded mode is reduced.

Some posts expressed worry that a hot spare (when available) will kick in if Unraid encounters a bad SATA connection. Exactly what I would want! First priority is the health of the array.

Other posts have mentioned the problem of keeping a hot spare available for disk arrays with mixed disk capacities. That is true. With disks of different capacities it takes a bit more work to arrive at the spare size you need. In that case it is easier to use disks of the same capacity for every drive pool.

If the hot spare feature is optional, one could choose to use it or not. A choice between a global and a local hot spare would complete the picture.

Would it not be a very peaceful thought that, when I am at work, I do not have to worry about my Unraid system at home because it has a hot spare available? Proactivity is the definition of being in control.

Cheers,
Marcel
pwm Posted June 10, 2018

43 minutes ago, SME Storage said:
Exactly what I would want! First priority is the health of the array.

A traditional RAID has a much larger need for hot spare support, because a traditional RAID that loses one disk more than the number of parity disks will suffer a 100% data loss.

unRAID doesn't stripe the data - every single data disk has a separate file system. So an unRAID system with one parity drive that loses two disks will lose 1 or 2 data disks (depending on whether one of the failed disks was the parity) - the other data disks will continue to supply valid file content. An unRAID system with dual parity will lose the data from 1, 2 or 3 data disks if 3 disks fail at the same time. The other disks will continue to supply valid file content.

Since a traditional RAID requires you to read back every single file from backup sources if you lose one disk too many, it's quite obvious why the recommendation is to have hot spare support. Parity is about availability (not a replacement for backup), and a full restore of all files from the backup is very far from the availability goals.

51 minutes ago, SME Storage said:
A choice to have a global or local hot spare would complete the whole.

Note that unRAID supports a single parity-protected array. Besides the array, you can use BTRFS mirroring - commonly used to get redundancy for a cache pool - especially if the cache pool is used to store VMs.
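The failure arithmetic in pwm's post can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not anything taken from unRAID itself: the function names are hypothetical, and the striped case is simplified to "everything is lost once failures exceed the parity count".

```python
def unraid_data_disks_lost(failed_data: int, failed_parity: int, parity_count: int) -> int:
    """Data disks whose contents are gone in a non-striped, unRAID-style array.

    Hypothetical helper for illustration. Parity can rebuild the failed data
    disks only if there are at least as many surviving parity disks as failed
    data disks; otherwise every failed data disk is lost, and only those.
    """
    if not 0 <= failed_parity <= parity_count:
        raise ValueError("failed_parity must be between 0 and parity_count")
    surviving_parity = parity_count - failed_parity
    if failed_data <= surviving_parity:
        return 0  # everything can still be rebuilt
    return failed_data  # disks that did not fail keep serving their own files


def striped_data_disks_lost(total_data: int, failed_total: int, parity_count: int) -> int:
    """Simplified striped RAID: one failure too many takes the whole array down."""
    return total_data if failed_total > parity_count else 0


# Single parity, two failures: 1 or 2 data disks lost, depending on whether
# one of the failed disks was the parity disk.
single_parity_cases = [
    unraid_data_disks_lost(failed_data=1, failed_parity=1, parity_count=1),
    unraid_data_disks_lost(failed_data=2, failed_parity=0, parity_count=1),
]
```

With eight data disks and single parity, `striped_data_disks_lost(8, 2, 1)` reports all eight disks lost, while the unRAID-style function reports at most two for the same two failures.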
SME Storage Posted June 13, 2018

Thank you very much for your post.

At work I manage almost one petabyte of data. The RAID solution in use is based on traditional RAID (RAID-4-DP). The file system is striped over multiple RAID arrays, which allows the use of large file systems. Each RAID array can lose 2 data drives simultaneously before it goes down. Hot spares are a must here.

Thank you for your explanation of Unraid and how it works. I could not have put it better. I am a little worried that I have not been able to find the right words to get my message across. Personally I have no doubts about Unraid's RAID and file system concept. I feel Unraid is doing a splendid job! A double-parity array together with a mirrored cache is a good enough solution in most cases.

In general, data protection comes in different levels. Parity is one, double parity is another, mirroring is one, and backup is also an important one.

Hypothetically speaking, with regard to the risk of multi-disk failure: it might seem over the top at first, but if a hot spare could be assigned automatically when a disk dies, you would save time and thereby reduce the chance of a multi-disk failure even further. And that is without considering the scenario where no replacement disk is on hand and one still has to be purchased before it can be added to the array.

I hope I was able to defend my hot spare opinion this time.

Regards,
Marcel
pwm Posted June 14, 2018

My recommendation is that you consider a backup of all files you really care about, stored on a separate machine. This makes sure the files survive if the PSU of the main server breaks and fries everything. And preferably the backup server stores the backup offline, so a virus or hacker can't erase everything.

If you do have a backup and lose more disks than the parity can handle, then you can restore just the specific disk(s) that lost data while still having access to the data on the other data disks. The main trick is to have some form of crontab job that keeps track of which files were stored on which disk, so you don't restore duplicates. This isn't an issue if the backup is made from disk shares, but if you back up user shares, the backup software will not see which disk contained which files.

Most of the people we read about on this forum or other RAID forums who fail to recover despite dual parity have almost always failed the very first rule of a RAID system. They have not been running any supervision where all disk surfaces are regularly verified and where any problem is reported in a way that the owner/administrator will see within 24 hours at the latest. So they think their system is well even when it is running with one or more disks broken. Only several months later, when one disk too many fails, do they log in and notice the catastrophic failure. Or if they notice that one or more disks are emulated and start a rebuild, they find that one or more of the remaining disks have unrecoverable read errors that were never noticed because the disks haven't been regularly scanned.

So in the end - if you lose three or more disks, the most probable cause is failed supervision. Just failing to notice the issue with the first disk. And then the second. And then the third. The much less likely, but still plausible, cause is some form of catastrophic event (temperatures, impact, supply voltages, ...) that is likely to have hurt every disk in the machine.

There are quite a number of unRAID users who think it takes too much time to set up mail notifications. Or who think that one mail per night from their unRAID system is just irrelevant spam. Quite a number of these users will show up in the support forum when it's too late to protect/recover all of their data. These are also often the people who think parity replaces the need for backup.
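The crontab job pwm describes could look roughly like the following Python sketch, which records which disk every file lives on. It assumes the data disks are mounted as disk shares under `/mnt/disk1`, `/mnt/disk2` and so on; the function name, the CSV layout and the output location are all hypothetical choices made for this example.

```python
import csv
import os
from pathlib import Path


def build_manifest(disk_roots, out_path):
    """Write one CSV row per file: disk name, path relative to the disk, size.

    disk_roots: iterable of mount points, e.g. ["/mnt/disk1", "/mnt/disk2"]
                (assumed layout, adjust to your system).
    out_path:   where to write the manifest; ideally somewhere that survives
                the loss of the array, such as the backup machine.
    Returns the number of files recorded.
    """
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["disk", "relative_path", "size_bytes"])
        for root in disk_roots:
            root = Path(root)
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in sorted(filenames):
                    full = Path(dirpath) / name
                    try:
                        size = full.stat().st_size
                    except OSError:
                        continue  # file vanished or is unreadable; skip it
                    writer.writerow([root.name, str(full.relative_to(root)), size])
                    count += 1
    return count
```

A nightly cron entry could call `build_manifest(sorted(glob.glob("/mnt/disk[0-9]*")), target)` with a `target` path of your choosing. After a multi-disk loss, filtering the manifest on the `disk` column gives the list of files to restore from a backup that was taken from user shares.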
c3 Posted June 15, 2018

You need to decide which factor is your primary concern: data durability (data loss) or data availability. As mentioned, backups dramatically improve data durability. But if you are after data availability, you'll need to handle all the hardware factors: power supplies (as mentioned), memory (ECC and DIMM fail/sparing), cooling, and probably networking (LACP, etc.).

SME Storage said:
Some posts expressed worry that in case unraid encounters a bad SATA connection a hot spare will kick in. (when available) Exactly what I would want! First priority is the health of the array.

The sparing process can be scripted. As a subject matter expert with your vast experience, you should find this straightforward. Perl and Python are available in the Nerd Tools. This may allow you to worry less while working. However, I am not sure it would be "hot", as the array must shut down to reassign the drive. You could implement NetApp's maintenance garage function to test the drive and then either resume or fail it.
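As a starting point for the scripting c3 suggests, here is a minimal Python sketch of one step only: choosing which spare could stand in for a failed disk in an array with mixed capacities. It relies on the usual Unraid replacement rule that a replacement data disk must be at least as large as the disk it replaces and no larger than the parity disk. The `Disk` type and `pick_spare` function are hypothetical, and the sketch deliberately does not stop the array or reassign the drive, since no scripting interface for that is described in this thread.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Disk:
    name: str
    size_bytes: int


def pick_spare(failed: Disk, parity_size_bytes: int, spares: Sequence[Disk]) -> Optional[Disk]:
    """Return the smallest spare that can replace `failed`, or None if none fits.

    A spare qualifies when failed.size <= spare.size <= parity size.
    Picking the smallest qualifying spare keeps larger spares free for
    larger disks that may fail later.
    """
    candidates = [
        s for s in spares
        if failed.size_bytes <= s.size_bytes <= parity_size_bytes
    ]
    return min(candidates, key=lambda s: s.size_bytes, default=None)


TB = 10**12
spares = [Disk("spare-a", 4 * TB), Disk("spare-b", 8 * TB), Disk("spare-c", 12 * TB)]
choice = pick_spare(Disk("disk3", 6 * TB), parity_size_bytes=8 * TB, spares=spares)
```

Here `choice` is `spare-b`: `spare-a` is too small for the failed 6 TB disk and `spare-c` is larger than the 8 TB parity disk. This also illustrates the capacity concern raised in the opening post, because a single global spare only covers every disk if it is as large as the largest data disk.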
SME Storage Posted July 6, 2019

Of course it is possible to add functionality yourself by scripting it. On the other hand, custom functionality will at some point work against you: it means additional maintenance and testing with every new Unraid version. You might like custom modding, and that is fine, but for now I would like to address high-level design. For example, the KISS principle: implement out-of-the-box solutions.

Hot spare functionality might be a game changer for Unraid's design. Have no doubt. It is an additional protection layer for securing data availability.

I like to think it is only a matter of time before LimeTech adds hot spare functionality. It is the logical next step in Unraid's evolution. Look around and see what is happening in today's homes with IoT. We see more and more automation and simplicity being added to our lives. Automation is the way forward. It is the difference between proactivity and reactivity.
Marshalleq Posted July 14, 2019

Actually, I was surprised Unraid did not have this time-tested feature. I personally think a hot spare capability would still be of benefit in Unraid. Even though you only lose one disk of data in Unraid, it is also about the risk of losing another disk. Once one disk is gone you can rebuild it, but not a second one. The likelihood of more than one disk dying simultaneously? Well, the more disks you have, the more likely it is. And yes, it does happen.

One advantage of a hot spare, particularly for smaller builds, is that you could do away with the negative performance impact of dual parity and still have cover to reduce the risk of a second disk dying. That risk is heightened once one disk has died, e.g. due to the extra heat from having to constantly calculate the contents of the failed disk from parity. It's a really great feature, and Unraid is the first redundant system I've ever seen that doesn't have it.
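The argument that a shorter degraded window lowers the chance of a further failure can be made concrete with a rough Python model. It assumes every remaining disk fails independently at a constant annualized failure rate (AFR); both the model and the function name are assumptions for illustration, and because the post above argues that the risk is heightened while the array is degraded, the model is likely to understate it.

```python
import math

HOURS_PER_YEAR = 8760.0


def p_additional_failure(remaining_disks: int, afr: float, window_hours: float) -> float:
    """Probability that at least one remaining disk fails within the window.

    Assumes independent disks with a constant failure rate derived from the
    annualized failure rate `afr` (0 <= afr < 1).
    """
    if not 0.0 <= afr < 1.0:
        raise ValueError("afr must be in [0, 1)")
    # Constant hazard rate per disk-hour that reproduces `afr` over one year.
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-remaining_disks * rate_per_hour * window_hours)


# Compare a rebuild that starts immediately on a hot spare (about one day
# degraded) with waiting a week for a replacement disk to be bought and fitted.
# The 2% AFR and 7 remaining disks are made-up example values.
p_hot_spare = p_additional_failure(remaining_disks=7, afr=0.02, window_hours=24)
p_one_week = p_additional_failure(remaining_disks=7, afr=0.02, window_hours=24 * 7)
```

Under these assumptions the probability grows with both the number of remaining disks and the length of the window, which is the quantitative core of the case for a hot spare: it shrinks the window but does not change the per-disk failure rate.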
miicar Posted January 25, 2023

Did this thought just die? Or has anyone heard that Unraid might be thinking about doing something like a hot spare? It would be nice for some of us!
schreibman Posted April 2, 2023

On 6/13/2018 at 5:26 PM, SME Storage said:
Thank you for your explanation of Unraid and how it works. I could not have put it better. I am a little worried that I have not been able to find the right words to get my message across.

I agree on both points; @pwm's explanation of Unraid and how it works was very nice.

However, my tl;dr on this is: if I'm accountable for managing [1+ TB of data] OR [0+ b of critical data], I would deploy a hot spare as a routine part of my business continuity plan.

Do any Docker containers, apps or user scripts provide hot spare functionality for the array or a cache pool?
VanGogh Posted April 10, 2023

On 6/10/2018 at 6:26 AM, SME Storage said:
Dear Unraiders and Lime Tech, [...]

Sonic exe Because failing slices from RAID 1 or RAID 5 volumes are automatically replaced and resynchronized by hot spares in the event of a failure, hot spares offer protection against the failure of hardware.
JonathanM Posted April 10, 2023

7 hours ago, VanGogh said:
Because failing slices from RAID 1 or RAID 5 volumes are automatically replaced and resynchronized by hot spares in the event of a failure, hot spares offer protection against the failure of hardware.

Unraid doesn't use RAID1 or RAID5 for the parity array. Or are you a bot/spammer account?