Writes not going to array when cache is full.



I am getting my feet wet with unRAID in a VM with some pass-through devices, so I have nothing in this config that can't be tossed and re-built. This is all in prep before the “big move”.

 

I am using version: 6.9.2

Share setting has cache set to “Yes”

Screen shots: https://imgur.com/a/lebjslP (Main & Share Settings)

Log dump attached.

 

I have a small cache disk, which is a VM disk sitting on an SSD (only 5GB for now; I intentionally made it small).

I have three spinners passed through as LUNs (3TB each, double parity).

 

So far, everything has been running great, until I enable the cache drive and start copying files. As soon as the cache disk fills, I get this from rsync:

 

rsync: close failed on "/mnt/###CENSORED FILE NAME###": No space left on device (28)

rsync error: error in file IO (code 11) at receiver.c(853) [receiver=3.1.3]
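
For reference, the copy is just a plain rsync into the user share, roughly along these lines (the source path and share name are placeholders, not my real ones):

# roughly the command I am running; paths are placeholders
rsync -avP /mnt/source/media/ /mnt/user/MyShare/media/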

 

I have tried stopping the array and setting a minimum free space on the cache disk to 1GB (more than double the size of any single file I am trying to copy), but it just fills the cache drive to 100% anyway. It seems to free up the space from the failed copy, but it never falls back to writing to the array (see the 220MB free in the screenshot, even with the 1GB minimum free).

 

Waiting a while and restarting the copy seems to work, but the minimum free space setting on the cache disk just doesn't appear to be honored…

 

It was my understanding that once a cache disk is full (or in this case minimum free is reached), any future writes will go to the array directly. Is that wrong? Is this a config issue, or have I hit a bug?

 

Also, I am assuming this is happening because once a write to the cache disk starts, it can't be moved to the array mid-copy... So I would think a minimum free space on the cache disk should be part of the default config (but it seems to be blank).

 

As I am writing this (and after I took the log capture), I am just NOW seeing this in the logs:
Jan 16 10:55:29 Tower shfs: share cache full

### [PREVIOUS LINE REPEATED 22209 TIMES] ###

 

Could this be related to having copy on write enabled (BTRFS not letting space go fast enough)? 
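
In case it is relevant, this is how I was planning to check (and, if needed, turn off) COW on the cache; the share directory here is just an example path:

# show whether the No_COW attribute (C) is set on the directory
lsattr -d /mnt/cache/MyShare
# set No_COW so files created in the directory from now on skip copy-on-write
chattr +C /mnt/cache/MyShare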

 

Please advise. Thanks!!!

 

Side note: the "Anonymize diagnostics" function doesn't seem to remove user names from the logs, so I did it manually.
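
(Nothing fancy, just a find-and-replace over the extracted diagnostics; "myuser" below stands in for the real account name and the folder name is a placeholder:)

# replace the real user name with a placeholder everywhere in the extracted diagnostics
grep -rl 'myuser' diagnostics/ | xargs -r sed -i 's/myuser/REDACTED/g'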

tower-diagnostics-20220116-1107.zip

19 minutes ago, BloodBlight said:

It was my understanding that once a cache disk is full (or in this case minimum free is reached), any future writes will go to the array directly. Is that wrong? Is this a config issue, or have I hit a bug?

Have you explicitly set the Minimum Free Space for the cache as described here, rather than the one at the share level that applies to array disks?


Yes, but with that value no file can be as large as 1GB or you will still have problems, as Unraid does not take the size of a file into account when selecting the location to store it. Unraid never switches to another location once it has selected one for a file; instead you get out-of-space errors when the file does not fit.


Understood.  The largest file being copied is only about 400MB, so that should not be an issue here.  And right now, even with that set, it is sitting at 220MB free:

Screenshot: Gg5nYR4.png

 

Even if you subtract the largest file in the copy (400MB) from the 1GB, it should never go below 600MB free...  So it seems like something isn't working right.


Another possibility: if rsync doesn't let the system know how big the file is in advance (no idea if it does or doesn't), then the system doesn't know that the file will "violate" the minimum free space and will put it there, because for all it knows the file could be as small as a single byte.


I don't think that it does.  I am working under the assumption that at least one file will cause a policy violation before triggering the rule to store elsewhere.

 

I picked these files and sizes intentionally. The 1GB minimum free space should be able to accommodate the largest file (400MB) two times over.  So if one file takes it below the 1GB mark (by even 1 byte), the OS can start writing a second file (assuming the file that broke the rule is still flushing), but by the time it gets to the third file, the first should be flushed and trigger the rule to move it down to the next tier of storage.
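
To make my mental model concrete, here is a rough sketch of the decision as I understand it (this is just my assumption about the behavior, not the actual shfs logic; the path and the 1GB value are from my setup):

#!/bin/bash
# My assumption: the target is picked once, up front, from current free space
# vs. Minimum Free Space; the size of the incoming file is never considered.
MIN_FREE_KB=$((1024 * 1024))                                   # 1GB in 1K blocks
free_kb=$(df --output=avail /mnt/cache | tail -1 | tr -d ' ')  # current free space

if [ "$free_kb" -gt "$MIN_FREE_KB" ]; then
    # cache chosen; a file larger than free_kb still lands here and later
    # fails with "No space left on device" instead of falling back to the array
    echo "write goes to cache (free=${free_kb}K)"
else
    echo "write goes to the array"
fi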

 

Does that line of thinking sound right?  Sorry, I am very good at breaking things. 


Interesting...

 

So, SSHFS, no change.

 

If I rsync files to the drive and cause the error, then do a dd to the share, it goes directly to the array.  But if I delete a single 400MB file, taking the free space up from 220MB to 620MB, the dd will go to the cache drive even though it has less than the 1GB minimum, and it errors out!


I freed up some space and am dd'ing out in 100MB chunks.

It let me write to the cache drive all the way down to 113MB free...
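
(Each chunk is just a 100MB file of zeros written at the user share; the share and file names here are placeholders:)

# write one 100MB test file to the user share
dd if=/dev/zero of=/mnt/user/MyShare/chunk_01 bs=1M count=100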

 

But I also noticed this:

Screenshot: i0z1W5j.png

 

 

This got me thinking that this could be a simple math error.

 

So I adjusted the minimum free space for the cache to 1.6GB (I came to that with some math I can't remember) and shazam!  No more writes to the cache drive, even for 1k files.  I ended up with 376MB free when I SHOULD have had no less than 1.6GB.

 

But I re-did my tests and was able to zero it out again...  So IDK at this point...

 

