Unraid cache & mover, what is going on??


Dav3


I have unraid set up on an 'array' currently consisting of a 4GB HDD (no parity), formatted xfs, with 139GB currently used.  I just added a 500GB SSD as a cache drive, also formatted xfs.  I stopped VMs & docker and manually started the mover.  The cache immediately filled up to 100%.  How can this be if the array only contains 139GB of data?

 

Note: I do have a VM set up to use a 1TB SSD (configured as a qemu raw drive, not vfio controller passthrough).  It has a smallish .img file associated with the SSD that sits in the domain share.  Could this be causing the problem, acting as a proxy for the entire 1TB SSD and triggering a 1TB copy that fails?

 

Also, I do have the new 'pass-through=true' setting enabled for the passthrough disk in the unassigned disks plugin (@Squid). Could this be affecting the issue?

 

If it's the VM's passthrough disk that's causing the problem, what is the recommended best practice for this scenario?

 

Additionally, how does a user gain insight into what the mover is doing?  I've seen forum mentions to watch the syslog, but I'm not seeing anything there.  I've tried grepping lsio output, but is there any better way?

 

Edited by Dav3
Link to comment

Well, I'm not sure what happened, but after rebooting the server, mover now completes instantly.  Not sure if this is a good or bad thing, but even though all VMs & docker are stopped, I still see that /mnt/user/system/libvirt/libvirt.img hasn't been moved to /mnt/cache as expected.

 

Damn, this stuff is opaque.  How can I figure out what mover is & is not doing??

 

Link to comment
13 minutes ago, itimpi said:

Have you enabled the mover logging?    It is off by default.

Yeah it sounds like this is the piece of the puzzle I'm missing.

I can't find the mover settings...  Looking in the usual places, searching the web.  Where are mover settings?

 

UPDATE:  Found it, under 'Scheduler'...  Yep, this should help.  Thanks for the tip.

 

Edited by Dav3
Link to comment

Thanks!  I just happened to be reading & considering this post right now.  It's very helpful.

 

I'm now seeing the libvirt.img file is being skipped by the mover in the syslog.  Yay I can see!

Is this by design, or is something wrong?  Are there any rules for mover 'skip' decisions besides 'file in use'?
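
For anyone following along, here is a minimal sketch of pulling mover activity out of the syslog once mover logging is enabled. It assumes the log is at /var/log/syslog and that mover's lines contain "mover" or "move:"; both the function name and the pattern are illustrative guesses, so adjust them to whatever your own log actually shows.

```shell
# mover_log [FILE]: print lines that look like mover activity.
# Assumptions: FILE defaults to /var/log/syslog, and mover entries
# contain "mover" or "move:" (case-insensitive). Adjust as needed.
mover_log() {
    grep -iE 'mover|move:' "${1:-/var/log/syslog}"
}

# Usage (run manually):
#   mover_log | tail -n 50                               # most recent entries
#   tail -f /var/log/syslog | grep -iE 'mover|move:'     # watch live
```

If nothing shows up at all, the first thing to check is that mover logging is actually switched on under Scheduler.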

 

Link to comment

Ok, I admit I don't understand.  Now that I have mover logging enabled, I'm testing simple scenarios and not seeing behavior I expect.

 

Currently all shares are set to cache = no.  I set share 'system' to cache = yes and trigger mover (via Main / Move Now button).  I see mover start & stop in syslog with no additional mover messages.  Shouldn't I see mover copying files from /mnt/user/system to /mnt/cache/system?

 

Update, I see that switching from cache = no to cache = prefer does trigger mover copy but setting from no to yes does not. (?)

Edited by Dav3
Link to comment
12 minutes ago, Squid said:

Because VMs are enabled, which means that libvirt.img is in use, and mover cannot move any file which is in use.

Oddly, I see 'mount' returning '/mnt/disk1/system/libvirt/libvirt.img on /etc/libvirt type btrfs (rw)' even though no VMs are running...  I think this is holding the img file open.  Is this correct?  Is there some 'VMs are enabled' setting I'm missing?  I'm just stopping them.
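
A small sketch for spotting image files that are still loop-mounted, based on the `mount` output quoted above. The function name is made up for illustration, and the pattern simply assumes mounted images show up as "<path>.img on <mountpoint>", as in that output.

```shell
# image_mounts: filter `mount` output (read from stdin) down to
# entries whose source is an .img file.
# Assumption: such mounts appear as "<path>.img on <mountpoint> ...".
image_mounts() {
    grep -E '\.img on '
}

# Usage (run manually):
#   mount | image_mounts
```

While an image is listed here it is still mounted, so mover will treat it as in use. `losetup -a` should show the same image attached to a loop device.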

Edited by Dav3
Link to comment
2 hours ago, Dav3 said:

A note to whoever takes care of and feeds the mover script:

 

I can see in the /usr/local/sbin/mover script that it only looks for 'prefer' not 'yes' to initiate a sync from array to cache.

 

 

 

With the current 6.9.x series of releases if you look at the Use Cache setting for a share then against that entry it tells you what action (if any) mover will take for the current setting, and that text changes dynamically as you change the value of the setting.   That was done to try and make it clearer because users never seem to read the built-in help and get confused by the meaning of the options available.

Link to comment

I see what you're saying, and I did read the forums, but the built-in help often misses my attention.  Also the use of 'cache' terminology is a bit of a misnomer, to me cache=yes implies initial synchronization similar to cache=prefer.  Like usual I over-thought the issue.  As apparently others have.  I'd suggest surfacing help a little better, perhaps in a side-bar table element.

Link to comment

6000, oh no!  I would say that I'm an "expert" in the context of this discussion.  Switching ON help by default with an acknowledgement that the user at least knows its whereabouts and how it is there to help, believe it or not, may work. Otherwise I'm lost for words 😏

Edited by superloopy1
Link to comment
5 hours ago, superloopy1 said:

6000, oh no!  I would say that I'm an "expert" in the context of this discussion.  Switching ON help by default with an acknowledgement that the user at least knows its whereabouts and how it is there to help, believe it or not, may work. Otherwise I'm lost for words 😏

He's being funny. Take a look at his current post count.

 

Humour typically has a grain of truth though.

Link to comment

Well, I stand by the statement that while switching from cache=off to cache=preferred works as one would expect, switching from cache=off to cache=on does not.  It's a corner-case, but an initial sync should be at least offered.  I think it's that initial transition that's confusing, not the behavior going forward.

 

Beyond that, a little criticism is probably deserved.  It's just been more than a little frustrating spending epic time troubleshooting issues in my attempt to move my development workstation platform off of VMware Workstation and onto unraid.  I had expected to get it done over the Christmas break.  It's not a learning curve, it's been a learning El Capitan face-route climb.  Far too many times have I been sucked into the Linux weeds, with fun topics like dumping VGA BIOSes, iptables & go scripts.  As it is, I'm in multi-boot hell trying to get work done in the day, 'transitioning' at night & on 'breaks', and not enjoying the experience, although when the virtualization stuff works it really shows great potential.  My wife & kids (& sleep!) have been the ultimate losers.

 

In hindsight, I think my mistake was trying to shoehorn what is (or was) essentially designed to be a media bulk-storage NAS product into being a workstation virtualization platform.  And my multi-boot 'solution', expecting a short transition period, was woefully mistaken.

 

'Nuff said. What I think obviously isn't going to change anything, it's an unproductive religious argument, so on to more productive things...

 

 

Link to comment
1 hour ago, Dav3 said:

switching from cache=off to cache=preferred works as one would expect, switching from cache=off to cache=on does not. 

I assume here you meant switching from cache-no to cache-yes. Arguably cache-yes and cache-prefer are both cache on. In what way does cache-yes not work as expected? It writes new files to cache and moves them to the array.

 

Yes was the original use of cache. Its purpose was to speed up writes to the server by going to a disk that wasn't affected by parity writes, and then later moving to the parity-protected array.

 

1 hour ago, Dav3 said:

I think it's that initial transition that's confusing, not the behavior going forward.

It is important to be aware that ALL user share settings are "the behavior going forward". Nothing is done to existing files.

Link to comment
15 minutes ago, trurl said:

Nothing is done to existing files.

Nit to pick. Files existing in a share currently set to cache no and cache only will get moved if they are on the "wrong" disk when changing to cache yes or prefer. I think that is the transition the OP is complaining about. I believe he wants the move to work the other way as well, so all existing content is forced to follow the new rule.

 

However... I purposely use cache no and only for specific shares that I want to stay on both cache and array. My domains share is set cache only, and I manually move seldom used vdisks to the array, and new VM's I define show up on the cache. You could use cache no for a situation where you want new content to end up on the array, but cherry pick some files you want to live on the cache for speed.

 

tldr; The existing behaviour evolved over many years, there is logic to it all, but the descriptions can be clumsy and hard to understand until you dig into it.
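
As a summary of the rules described in this thread, here is a small sketch of which way mover moves files for each Use cache setting. It is a restatement of the posts above, not the actual /usr/local/sbin/mover script, and the function name and output strings are invented for illustration.

```shell
# mover_direction SETTING: print which way mover moves a share's files.
# Summarises the behaviour described in this thread; not taken from
# the real mover script.
mover_direction() {
    case "$1" in
        yes)     echo "cache->array" ;;  # new files land on cache, mover moves them to the array
        prefer)  echo "array->cache" ;;  # mover moves files from the array onto the cache
        no|only) echo "none" ;;          # mover leaves the share's files where they are
        *)       echo "unknown"; return 1 ;;
    esac
}

# Usage (run manually):
#   mover_direction prefer
```

This is why switching a share from no to yes shows nothing in the log: files already on the array are exactly where "yes" wants them, so mover has nothing to do.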

Edited by jonathanm
Finished my thought.
Link to comment
7 minutes ago, jonathanm said:

Nit to pick.

Yes I wasn't clear about that. Mover moves existing files (how could it move non-existing files?) depending on the Use cache setting.

 

2 minutes ago, jonathanm said:

The existing behaviour evolved over many years, there is logic to it all, but the descriptions can be clumsy and hard to understand until you dig into it.

One word (no,yes,prefer,only) can't possibly give enough information to understand.

 

4 minutes ago, jonathanm said:

I purposely use cache no and only for specific shares that I want to stay on both cache and array. My domains share is set cache only, and I manually move seldom used vdisks to the array, and new VM's I define show up on the cache. You could use cache no for a situation where you want new content to end up on the array, but cherry pick some files you want to live on the cache for speed.

The link I gave earlier even talks about these uses, describing them as "orphaned files". Maybe you were the one who first pointed that out, don't remember.

 

Link to comment
  • 1 year later...

I also had the problem that my disks were always spun up. Then I realized that two files from the system share were being held on my array:

/mnt/disk1/system/docker/docker.img

/mnt/disk1/system/libvirt/libvirt.img

 

To check this, go to SHARES->system. There are two directories: docker and libvirt. If the content of a directory is completely on the cache, the LOCATION column shows "cache". If it shows "cache, disk", then some files from that directory are also on an array disk. If they are on an array disk, running docker containers or virtual machines will keep that disk spun up.

 

To move them to cache:

  1. I had to disable the docker service (Settings->Docker->Enable Docker: No) and VMs (Settings->VM Manager->Enable VMs: No)
  2. I had to invoke the Mover (Settings->Scheduler->Mover Settings->Move now)

The cache setting of the "system" share has to be set to "Prefer : Cache" for the files to be moved.

After the files had been moved, I started the docker service and VMs again.
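
The location check described above can also be done from a shell. This is a rough sketch, assuming the pool is mounted at /mnt/cache and the array disks at /mnt/disk1, /mnt/disk2 and so on, as in the paths quoted in this thread; the function name and its second argument are just for illustration.

```shell
# share_locations SHARE [ROOT]: list every file of a share, per device,
# so you can see whether anything still sits on an array disk.
# Assumptions: pool at ROOT/cache, array disks at ROOT/disk<N>,
# ROOT defaults to /mnt.
share_locations() {
    share=$1
    root=${2:-/mnt}
    for dev in "$root"/cache "$root"/disk[0-9]*; do
        if [ -d "$dev/$share" ]; then
            find "$dev/$share" -type f
        fi
    done
}

# Usage (run manually):
#   share_locations system
```

Any path printed under /mnt/disk<N> is a file that will keep that disk busy while docker or the VM service is running.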

Link to comment
