Keep certain VMs running without the array started


Recommended Posts

  • 2 weeks later...

This! I have to run Unraid as a VM on ESXi because of this exact constraint! I could run bare metal if the pools were separated out and could be managed independently (e.g. a setting to pin VM- and Docker-related content to a specific cache drive that can be managed independently of the other pools).

The same goes for Docker too!

I thought about using my second Unraid license as a separate VM on ESXi to run just Dockers (because I love the UI and app store), but this constraint of being tied to an Unraid array prevents running Unraid for just Docker - unless I missed something!?

Please find a way to make this happen team!



Link to comment

+1. Virtualizing a firewall makes using Unraid as the bare-metal OS for an all-in-one server a problem. Consider: I have an encrypted array. If I have an extended power failure and the UPS goes flat, I can't get back into the array without being physically present to enter the key and start it up again.

I don't want to have to virtualize unraid itself to achieve this. :(

Link to comment

+1 from me as well. I'd like to try to help justify the efforts with a quick list of use cases:

  • Firewall/Router - As others have noted, many of us run pfSense, OPNsense, ipFire, VyOS, etc. Virtualizing the router makes sense when running an Unraid server, given we're running many applications whose function relies heavily upon networking, and given the horsepower necessary to properly manage/monitor that traffic (traffic shaping/policing, threat detection and deep packet inspection, and so on). When trying to make the most efficient use of resources, having a separate physical machine for this isn't as cheap as buying an SBC like a Pi/ODROID/whatever; it can quickly add up to hundreds of dollars (not to mention the electrical efficiency lost by not having it in the same box)
  • Auth / Domain Controllers - For self-hosters and businesses alike, LDAP, DNS, TOTP, and others are often needed. I currently run mine in a VPS, as I can't have users lose the ability to authenticate simply because I need to replace or add a drive.
  • Home Automation - While many choose to use Docker containers for this, others use Home Assistant OS. Having all your automation go down every time you have to take down the array is a significant annoyance. As home automation becomes more and more mainstream, I can see a time in the not-too-distant future where integrated door locks and other home-access controls are considered 'normal'. The impact of completely losing your home automation's functionality will grow as well.
  • Mail server - I doubt many of us are running our own mail servers, but I know at least a few are. I'm willing to bet that those who are, are also running Unraid as a VM under Proxmox/VMware/etc., because this is something you absolutely *can't* have go down, especially for something as routine as adding storage.

 

I'm sure there are others I'm missing, but these are the big ones. Once ZFS is integrated, it'd be great to get some attention on this - I understand there are some complexities here, so I'll attempt to address some of them where I can with some spitballed ideas:

  • Libvirt lives on the array; how do we ensure its availability?
    1. Make the system share require a pool-only assignment (cache, though it could be any one of a number of pools) via an 'Offline VM Access' checkbox in order to enable this feature (I think this is the easier method), or -
    2. Dedicated 'offline' libvirt/XML storage location - this seems like it'd be more cumbersome, as well as less efficient given the need for dedicated devices
  • The array acts as one single storage pool to all applications; how can we ensure the VMs' resources will be available if we stop the array?
    1. Unassigned Devices - As @JonathanM noted, should Limetech take over UD, it could be used to keep this storage completely outside the active array, or
    2. Split the 'stop' function of the array - In conjunction with option 1 above, instead of a single 'it's all down now' button, create two separate stop functions, each with a popup stating something like 'capacity available for VMs will be limited to the total capacity of the assigned cache pool':
      • One for stopping the array - a notification pops up warning that all Docker containers will be taken offline, similar to the current popup.
      • Another for shutting down both the array and the pools - the popup again notes the impact.
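To make the split-stop idea a bit more concrete, here's a rough sketch of the warning logic the two buttons might drive. All of the names here are made up for illustration; nothing reflects actual Unraid internals:

```python
# Hypothetical sketch of a two-stage 'stop' dialog. Share names and
# locations are invented; this is just the proposal made executable.

def stop_warnings(scope, shares):
    """Warnings a two-stage stop dialog might show.

    scope:  'array' stops only the parity array; 'all' stops array + pools.
    shares: mapping of share name -> backing location ('array' or a pool).
    """
    if scope == "array":
        return [f"Share '{s}' goes offline (stored on the array)"
                for s, loc in shares.items() if loc == "array"]
    return [f"Share '{s}' goes offline" for s in shares]

shares = {"media": "array", "appdata": "cache", "domains": "cache"}
print(stop_warnings("array", shares))  # only the array-backed share is hit
print(stop_warnings("all", shares))    # everything goes down
```

The point of the sketch is just that the first stage leaves pool-backed shares (and the VMs/containers on them) untouched.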

 

Idk how useful any of these musings are, but it's been on my mind for a while, so I figured what the heck lol. Anyway, BIG +1 for this feature request!

Link to comment
4 hours ago, BVD said:

Split the 'stop' function of the array -

This is the crux of this whole request thread, and I'm finding it hard to see the use cases.

 

What functions do you want to accomplish that require only partially stopping the array? Is this exclusively about reconfiguring storage pools? How often does that happen without the accompanying need to power cycle while rearranging hardware? I doubt that reconfiguring storage pools is going to be able to be done live, at least in the current incarnation of the parity array.

 

Maybe focus on what tasks you want to accomplish without taking down everything. The more compelling and usable for the mass market use case, the better.

 

My arrays stay running for months on end, typically only downed by a power event or a major security update, so I'm not personally seeing the big attraction to partially stopping something that's running basically 24/7/365 anyway - which is what the first half of your post describes, and what I'm doing currently.

 

I guess what I'm saying is, you described all the reasons my Unraid servers stay running 24/7, but didn't address what you want to do differently than what is currently happening. Why do you want to NOT run 24/7?

Link to comment

The broader use case here is being able to replace a failed drive without taking down everything, but instead minimizing the impact strictly to those components necessary to do so imo.

 

Anyone that's self-hosting, whether for business or friends and family use, loathes downtime as it's a huge pain to try to schedule it at a time that's convenient for all. There are currently 4 family businesses as well as 7 friends that rely upon my servers as I've migrated them away from the google platform, and given even that small number, there are always folks accessing share links, updating their grocery lists, etc. All that stuff is on the pools.

 

For the home NAS user with no other individuals accessing their system for anything, I could see how it wouldn't really matter. But I feel like it's not uncommon for an Unraid user to have their system set up in such a way that taking everything down is currently necessary.

 

As far as reboots go, that's a separate thing imo - it could also be addressed in the UI by adding a Samba restart button in the event of an edit to smb-extra, allowing function-level reset to be executed from the UI instead of the CLI for VFIO binds, and so on. Most of these things can be done without reboots on more modern hardware; it's just not yet exposed in the UI.
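For what it's worth, the operations such UI buttons would wrap are already one-liners today. A hedged sketch of what the UI might shell out to (the PCI address below is a placeholder; I'm assuming the stock Samba `smbcontrol` tool and the kernel's sysfs reset hook):

```python
# Sketch of the commands an 'apply without reboot' UI could wrap.
# The PCI address is a placeholder, not a real device.

def samba_reload_cmd():
    # Ask all smbd processes to re-read smb.conf (picks up smb-extra edits)
    return ["smbcontrol", "all", "reload-config"]

def flr_reset_path(pci_addr):
    # Writing '1' to this sysfs file triggers a function-level reset,
    # e.g. before rebinding a device to VFIO
    return f"/sys/bus/pci/devices/{pci_addr}/reset"

print(" ".join(samba_reload_cmd()))
print(flr_reset_path("0000:01:00.0"))
```

Neither needs the array stopped or the box rebooted, which is the point.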

 

To me, this is a big logical step towards making things smoother for the user.

Link to comment

Hard drives fail, I guess, is the gist of it - it's an expected occurrence whose frequency increases with the number of drives you have attached. Since we know hard drives die, it just makes sense to me to minimize the impact of addressing those failures when they do occur (taking everything down to address an inevitable eventuality doesn't seem in line with the idea of a NAS, at least to me anyway)

  • Like 1
Link to comment
On 9/11/2021 at 5:28 AM, JonathanM said:

didn't address what you want to do differently than what is currently happening.

Imagine if ZFS or BTRFS designed their solution so that if you took one pool offline they all went offline?

 

My requirement is to be able to independently start and stop a pool, be it the parity array (or arrays, if multiple arrays arrive in the future) or a cache pool, with only the services using that particular array/pool impacted. A warning would be presented listing which services are affected, e.g. 'Share 3' is set to use 'cache pool 5', so it will be remounted without cache; OR all Dockers will be taken offline because the system image is stored on this pool; OR this list of VMs will be hibernated because their disk images are stored on this pool, while the others continue to run. And so on.
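Just to make the requirement concrete (all names below are hypothetical, purely for illustration), the warning could be derived from a simple mapping of each consumer to its backing pool:

```python
# Hypothetical sketch: list what is impacted when one pool is stopped.
# Consumer -> (kind, backing pool); every name here is made up.
consumers = {
    "share3":     ("share", "cache_pool5"),
    "docker.img": ("docker", "cache_pool1"),
    "win10-vm":   ("vm", "cache_pool5"),
    "ubuntu-vm":  ("vm", "nvme_pool"),
}

def impact_of_stopping(pool):
    """Return warning lines for everything backed by the given pool."""
    actions = {"share": "will be remounted without cache",
               "docker": "containers will be taken offline",
               "vm": "will be hibernated"}
    return [f"{name}: {actions[kind]}"
            for name, (kind, backing) in consumers.items() if backing == pool]

print(impact_of_stopping("cache_pool5"))
# warns about share3 and win10-vm; ubuntu-vm keeps running
```

Everything not in the returned list stays up, which is the whole requirement in one line.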

 

Design considerations - I'm trying hard not to tell you how to design the solution, as I hate it when business teams at work focus on the solution and not the requirement 🙂

 

thx!!!

Link to comment
7 hours ago, johner said:

Imagine if ZFS or BTRFS designed their solution so that if you took one pool offline they all went offline?

The major difference, and what makes this much harder, is that Unraid merges all the pools into the user share file system. It's one thing to have isolated pools operating independently, totally another to have a FUSE filesystem suddenly lose access to only part of the files it contains.

 

Changing how the user share filesystem works in Unraid is not a minor thing. Much work has gone into seamlessly merging the pools so the end user doesn't need to worry about which disk or pool really holds a file: it's always presented at the same location, and the share allocation and mover settings determine which pool accepts new writes and where the file ends up after a mover operation.
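To illustrate the difficulty (this is not how shfs is actually implemented, just the general union-filesystem idea), here's a toy model of what happens to a merged view when one pool goes away:

```python
# Toy model of a merged view over several pools. Real Unraid uses a FUSE
# layer (shfs); this only shows why a partial stop is awkward.

pools = {
    "disk1": {"Movies/a.mkv", "Docs/report.pdf"},
    "cache": {"Docs/draft.pdf", "appdata/db.sqlite"},
}

def user_share_view(online):
    """Merge the file listings of all currently-online pools."""
    merged = set()
    for name in online:
        merged |= pools[name]
    return merged

full = user_share_view(["disk1", "cache"])
partial = user_share_view(["cache"])      # array (disk1) stopped
print(sorted(full - partial))             # files that vanish from /mnt/user
```

The paths that disappear don't error out cleanly; from an application's point of view, part of a single filesystem just silently ceases to exist, which is the scenario the FUSE layer currently never has to handle.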

 

7 hours ago, johner said:

design considerations - I’m trying hard to not tell you how to design the solution,

Please, by all means lay out how you would design it, given the constraints already in place with multiple pools all interacting in a single filesystem. I'm not at all involved with programming Unraid; I'm just trying to get a workable set of plans in place, to hopefully request small enough bites of the overall design that eventually we can get to where we want to go. The feature request as asked, "certain VMs running without the array started", is probably not going to happen without more basic asks being implemented first, so if we can distill down what is required, that will go a long way toward the end goal.

Link to comment
  • 5 weeks later...

+1 from me

 

On 9/12/2021 at 10:38 AM, JonathanM said:

Please, by all means lay out how you would design it, given the constraints already in place with multiple pools all interacting in a single filesystem.

If we were to get multiple arrays (and that's a big if), how would they be implemented? Personally, I don't see a use case for multiple arrays where you'd have files and directories spread across them, so arrays would be independent of each other. Given that, taking individual arrays offline shouldn't be a problem. So I guess this request boils down to having multiple independent arrays?

Ofc, I might be fundamentally misunderstanding how unRaid works. 😅

 

Link to comment
On 10/16/2021 at 3:12 AM, SvbZ3r0 said:

+1 from me

 

If we were to get multiple arrays (and that's a big if), how would they be implemented? Personally, I don't see a use case for multiple arrays where you'd have files and directories spread across them, so arrays would be independent of each other. Given that, taking individual arrays offline shouldn't be a problem. So I guess this request boils down to having multiple independent arrays?

Ofc, I might be fundamentally misunderstanding how unRaid works. 😅

 

 

There are folks who use Unraid for massive archival tasks where tape isn't really an option, such as web archiving, where multiple arrays would be of significant benefit. I don't think it's a big percentage of users or anything, but they're power users for sure and often have 100+ drives, some running one Unraid instance as the main hypervisor and then nested Unraid VM instances to allow for multiple arrays beneath.

 

Definitely not a common scenario... but one way we could see a benefit: by splitting our usage across multiple arrays, we'd strongly reduce the possibility of a full failure during reconstruction - I don't relish the idea of having more than, say, 12 drives or so in any type of double-parity array (regardless of ZFS, Unraid, LVM/MD, etc). Still, idk how many people actually run that many drives these days; I might be completely off base 🤔

Edited by BVD
Link to comment
  • 4 weeks later...

Working on my array at the moment - upgrading/changing disks makes me wish that stopping the array didn't mean losing internet whenever the computer isn't even being reset. This makes me want to get additional hardware to run the internet, but then one of the points of Unraid was to consolidate hardware. I really don't see why this hasn't been added when the VM (really, all I care about is pfSense or any other internet-type VM) doesn't have any dependencies on the array. If the reason is money-related, or something like that, I wouldn't mind paying for "plus" features, and maybe additional features could be added in the future. Just don't make it a subscription and I'll be happy.

+1

Link to comment
  • 4 weeks later...

I run pfSense in a VM, and there are definitely times I would find it useful to take down the array without losing my router. Currently it's located on the cache, but I would move it to the NVMe drive in an instant if this would work.

 

I don't do it often, but there are times I seem to have to do it many times over a few days.

Link to comment
  • 2 months later...
2 hours ago, ysss said:

+1 to VMs that can stay up during array maintenance.

 

Some disk maintenance can take a while before the array becomes available again; that's the reason I'm still keeping my home automation VM on a separate VMware machine.

 

 

Reaffirming the home assistant note - I'd not really thought of it a great deal until recently, as we just finished purchasing and moving into our new home about 4 months ago, and have finally gotten it to the stage where we're ready to implement some of our plans for the place (connected garage door, motion detection and temp control, soil monitoring for the garden, etc)...

 

Say you suffer a double failure - you want to minimize IO during the rebuild to mitigate (as much as one can) the potential for a third failure during reconstruction, so you start the array in maintenance mode. With 16TB drives becoming more commonplace, 20TB drives now available, and 24TB+ drives on the horizon, rebuild times are only going to get longer. If we average 150MB/s across the entirety of reconstruction (which even then is wildly optimistic), you're looking at ~1.5-2 days of downtime.

 

Not a huge deal for a media server. But for critical needs (like heating and cooling, home security, and many others), that'd feel like a lifetime.
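For reference, the napkin math behind those numbers (a sketch; 150 MB/s sustained across a whole rebuild is, as noted, optimistic):

```python
# Rough rebuild-time estimate. 150 MB/s sustained is a best case for a
# full-disk reconstruction on a busy array; real throughput is often lower.

def rebuild_days(capacity_tb, mb_per_s=150):
    seconds = capacity_tb * 1e12 / (mb_per_s * 1e6)
    return seconds / 86400

for tb in (16, 20, 24):
    print(f"{tb} TB -> {rebuild_days(tb):.1f} days")
```

A 16TB drive already works out to well over a day, and every capacity bump stretches it further.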

Link to comment
On 2/25/2022 at 7:56 AM, BVD said:

 

Reaffirming the home assistant note - I'd not really thought of it a great deal until recently, as we just finished purchasing and moving into our new home about 4 months ago, and have finally gotten it to the stage where we're ready to implement some of our plans for the place (connected garage door, motion detection and temp control, soil monitoring for the garden, etc)...

 

Say you suffer a double failure - you want to minimize IO during the rebuild to mitigate (as much as one can) the potential for a third failure during reconstruction, so you start the array in maintenance mode. With 16TB drives becoming more commonplace, 20TB drives now available, and 24TB+ drives on the horizon, rebuild times are only going to get longer. If we average 150MB/s across the entirety of reconstruction (which even then is wildly optimistic), you're looking at ~1.5-2 days of downtime.

 

Not a huge deal for a media server. But for critical needs (like heating and cooling, home security, and many others), that'd feel like a lifetime.

I agree. Honestly, I don't see why VMs can't run without the array, I really don't. I thought about getting a separate machine, but that would defeat the point of consolidating equipment.

Link to comment
12 hours ago, JustOverride said:

Honestly, I don't see why VMs can't run without the array, I really don't

 

I understand you don't see why, but you are not the developer and aren't aware of the internals :)

 

The array and pools cannot be easily separated. First and foremost, the licensing scheme is based on the number of attached devices when you start the array, and this includes both array and pool devices. Unless the licensing scheme is changed, it is not possible to start pools only.

 

Secondly, all internal handling of devices needs a complete rework before the array and pools can start independently. This is a major change and should not be taken lightly.

 

  • Like 2
Link to comment
On 2/25/2022 at 1:56 PM, BVD said:

Not a huge deal for a media server. But for critical needs (like heating and cooling, home security, many others), thatd feel like a lifetime.

Wouldn't it be better to use a second instance of Unraid for such critical always-on services?

I recently bought such a device used for about €100 and run, for example, my DNS server, AdGuard, IRC, and other trivial services on it; it consumes only about 5 to 10W and is passively cooled.

 

It ran almost forever without a hitch on 6.8.3 until I upgraded it to 6.9.2... :D

 

If you're wondering, I run this in a slightly odd configuration, but it works: all data for the containers is stored on the "Cache" and is mirrored every week, without compression, to the "Array" through the CA Backup plugin, which also keeps 3 revisions of the backup:

[screenshot of the share/pool configuration]

 

Remember, this is just an idea - I ran into the same issue as you, and I know the cost of a second license and the hardware itself should also be considered, but I'm really happy with how it is right now because I was facing the same problem...

The hardware also supports virtualization; as long as you don't need that much horsepower, you should be good to go... :)

 

You could also run a surveillance container on it, since you can use Intel QuickSync for HW transcoding, and maybe connect an external USB drive mounted through Unassigned Devices for the recordings.

 

A firewall would also be possible, since you have two LAN ports and could extend that with, for example, an inexpensive Amazon Basics USB-to-LAN adapter, but I really don't recommend using Unraid as a firewall - I really don't like virtualized firewalls.

 

There are also more powerful versions out there with i3, i5, or even i7 CPUs and 4+ LAN ports, e.g. the i5.

  • Like 1
Link to comment
4 hours ago, ich777 said:

Wouldn't it be better to use a second instance from Unraid for such critical always on services?

 

@ich777 I actually went a similar route tbh - I'd had everything running in one box, and every time I had to take it down for whatever reason (adding a SAS card for the disk shelf, pulling one of the GPUs for another project, etc) it was something of a planning ordeal.

 

These days I have two unraid boxes (the primary, then my older setup acting as something of a backup vault / tinkering playground) as well as one proxmox host.

* CARP ensures routing is always available through the pfSense VMs (one on Unraid, one on Proxmox)

* ZFS snapshots ensure that the Proxmox host's datasets are never more than 10 minutes out of date (just for those "tier 1" [home edition, lol] services)

* if the Proxmox host detects any of those services are down for whatever reason, it starts up its own copies of them locally. Kinda stole the idea from one of @SpaceInvaderOne's old pfSense videos and adapted it.

 

There's some additional logic there to ensure it only initiates starting its own services when it hasn't already done so because I shut down the array on purpose, but that's the gist.
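The takeover decision I just described boils down to something roughly like this (a sketch with made-up names; the real setup is CARP plus whatever monitoring glue is on hand):

```python
# Hypothetical sketch of the standby host's 'start local copies' decision.
# All inputs are invented names, purely to show the gist of the logic.

def should_start_local(primary_up, intentional_shutdown, already_started):
    """Start a local copy only when the primary is down unexpectedly."""
    return not primary_up and not intentional_shutdown and not already_started

print(should_start_local(False, False, False))  # primary died -> take over
print(should_start_local(False, True, False))   # planned array stop -> wait
```

The middle flag is the important one: a planned array stop shouldn't trigger a failover storm.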

 

I just think, given how rare it is for folks to be running multiple NAS boxes/servers at home (home NAS users are already niche, and those running multiple are a niche within a niche), there's a lot of value in the idea of allowing VMs to run with the array down. Maybe once multiple arrays become a thing, it'll make more sense for Limetech to implement some logic like 'this VM doesn't rely on data in array X, so it remains running'.

 

We'll see!

Edited by BVD
Link to comment
2 minutes ago, BVD said:

I see a lot of value in the idea of allowing VM's to run with the array down.

I really don't think this will ever happen, because if you allow this, the licensing model of Unraid is basically useless.

Think about it the other way around: a user who installs the ZFS plugin, creates a zpool and/or mounts disks via Unassigned Devices, and is allowed to run VMs and/or Docker without the array started - what's the point then of the license?

I think you get what my point here is...

  • Like 3
Link to comment
