Unable to write to cache and docker image



My Win10 VM suddenly got disconnected, and when I checked the server dashboard, the log was at 100%. I downloaded a diagnostics file at that point. Then I rebooted and as expected, log was cleared, but now Fix Common Problems reports "Unable to write to cache" and "Unable to write to Docker Image". (I took a quick look at rTorrentVPN and while it starts, the rTorrentVPN GUI shows no activity, not even a list of torrents.) I have attached the post-reboot diagnostics file as well.

 

What should I do next?

First - tower-diagnostics-20200413-1845.zip  Second - tower-diagnostics-20200413-1938.zip

Link to comment

I'm using ECC RAM, so I didn't run memtest.

 

My problem is how to back up the files on the cache drive(s). I've tried various things to copy files off the drives, but everything I've tried throws up errors. (I've tried mc via SSH, and Teracopy, WinSCP and a few others from my Windows PC.) I've also tried the CA Appdata Backup plugin, but it just flashes briefly that it's working; there's no error message, yet the output folders are empty.

Link to comment

Unfortunately no joy. I went to the link you provided, and tried the first two approaches. My cache drives (btrfs pool) are sdd and sdb.

 

1) Mount filesystem read only (non-destructive)

I created mount point /x and then tried

mount -o usebackuproot,ro /dev/sdd1 /x

This gave me an error

mount: /x: can't read superblock on /dev/sdd1.

(Same result if I tried sdb1.) Then I tried

mount -o ro,notreelog,nologreplay /dev/sdd1 /x

This produced the same error.
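For what it's worth, a couple of read-only checks like these can be run safely even on a pool that won't mount (device names are from my setup; neither command writes anything):

```shell
# Dump the superblock so you can see whether it's readable at all
btrfs inspect-internal dump-super /dev/sdd1

# Read-only consistency check: reports errors but repairs nothing
btrfs check --readonly /dev/sdd1
```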

 

So I moved to

2) BTRFS restore (non-destructive)

 

I created the directory /mnt/disk4/restore. Then entered

btrfs restore -v /dev/sdd1 /mnt/disk4/restore

After a few seconds I got this error message:

/dev/sdd1 is currently mounted.  Aborting.

This looks odd (in that the disk is mounted and therefore presumably accessible), so I thought I should check whether I've missed anything so far.

Link to comment

I can access the cache, but copying files usually produces errors. For example, this is what I get when I try to copy \appdata\MKVToolnix to a local drive:

 

[screenshot: error copying \appdata\MKVToolnix]

 

Trying to copy the same folder to /mnt/disk4 on the server via mc on console produces an error in the same place (?):

 

[screenshot: mc copy error]

 

The other weirdness is that some files seem to go offline and then online again very rapidly, and there's no visible copying progress. Unfortunately I can't seem to capture the Teracopy log as text, but here's a screencap:

 

[screenshot: Teracopy log]

 

Some files do copy OK, but backing up what I can would be almost like trying to copy each file individually and seeing which ones work. So I'm a bit stuck at the moment.

Link to comment
9 hours ago, johnnie.black said:

Yes, cache is showing as correctly mounted, though it's completely full. You can't access it via shares or console?

 

Syslog is spammed with syslog server related errors (possibly because it's writing to cache and it ran out of space) so can't see if there are more serious issues.

Just to add: the cache seems to be mounted as read-only, so I can't delete anything. Is there a way to set it back to "normal" and allow me to clear some space?
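In case it helps, these are the commands I was planning to try (I assume the remount will simply fail if btrfs forced the filesystem read-only after hitting an error):

```shell
# Show how the pool's space is allocated (read-only query)
btrfs filesystem usage /mnt/cache

# Attempt to flip it back to read-write; btrfs refuses this if the
# filesystem was forced read-only by an error
mount -o remount,rw /mnt/cache
```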

Link to comment

As suspected, there are checksum errors; btrfs gives an I/O error when corrupted data is detected (and can't be fixed), so you know there's a problem.

 

You can use btrfs restore (with the pool unmounted) to copy that data since it will ignore any checksum errors, but the data will still be corrupt.
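Roughly, with the array stopped so nothing holds the pool mounted, the sequence would look something like this (device and destination names are just the ones from this thread):

```shell
# Run only with the pool unmounted (e.g. array stopped from the GUI)
umount /mnt/cache 2>/dev/null || true

mkdir -p /mnt/disk4/restore

# Copies what it can, ignoring checksum errors; note that restored
# files may still contain corrupt data
btrfs restore -v /dev/sdd1 /mnt/disk4/restore
```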

Link to comment

So it sounds like I've pretty much lost the data on the cache drive, although I might be lucky with some files (and it would take forever to work out which files are OK).

 

That being the case, I think I should just try to recreate the cache pool from scratch. A real pain, but doable. Before I do: is there anything wrong with the drives themselves?

 

And do you think it was the corruption that led to the cache drive being full or the other way round? Because if it was the other way round, I need to monitor what goes on in the cache drive more carefully in future. If the corruption was the issue, any idea what caused it? 

Link to comment
5 minutes ago, sonofdbn said:

the other way round

My guess.

 

You had a number of shares set cache-only for some reason (though some of those have files on the array). Do you really need all of those to stay on cache?

 

No idea what you are using those for since their names have been anonymized. You have a pretty large cache so it should be pretty hard to fill it up. Are you trying to seed torrents from it or something? I always send torrents directly to the array.

Link to comment
6 minutes ago, trurl said:

My guess.

 

You had a number of shares set cache-only for some reason (though some of those have files on the array). Do you really need all of those to stay on cache?

 

No idea what you are using those for since their names have been anonymized. You have a pretty large cache so it should be pretty hard to fill it up. Are you trying to seed torrents from it or something? I always send torrents directly to the array.

Cache-only shares are a bit messed up because I set some shares up before I understood how the cache could be used. In reality, I have /appdata, /domains and /isos there, as well as torrents.

 

And, yes, torrents are also seeded from cache (didn't want to keep an array drive spinning). So that's probably the cause? I thought I left myself a reasonable margin (50GB), but perhaps I wasn't paying attention. I also left too many torrents seeding on the cache because the latest versions of unRAID unfortunately slowed down file transfers from cache to array, so I didn't move them out as often as I used to.

 

Bottom line, though, is that a full btrfs cache pool can be a pretty bad problem. Is there any notification that I could have enabled? My experience is with Windows, where I usually get a "disk is low on space" message.

Link to comment
48 minutes ago, johnnie.black said:

As suspected, there are checksum errors; btrfs gives an I/O error when corrupted data is detected (and can't be fixed), so you know there's a problem.

 

You can use btrfs restore (with the pool unmounted) to copy that data since it will ignore any checksum errors, but the data will still be corrupt.

Can you give more details on btrfs restore? If the pool is unmounted, how do I access the drives? Or do I unmount the drives and then mount them individually as - I don't know - unassigned devices?

Link to comment

One thing to consider is that mover can't move open files (such as active seeds). That's one of the reasons I send torrents directly to the array. I have them on a share that only includes a single disk.

 

But if you want to put torrents on cache, you should use a cache-prefer share so they can overflow to the array after cache gets less than Cache Minimum Free (Global Share Settings)

Link to comment
6 minutes ago, trurl said:

One thing to consider is that mover can't move open files (such as active seeds). That's one of the reasons I send torrents directly to the array. I have them on a share that only includes a single disk.

 

But if you want to put torrents on cache, you should use a cache-prefer share so they can overflow to the array after cache gets less than Cache Minimum Free (Global Share Settings)

I don't use mover at all (so not really using cache disk as a cache). I move files manually. But the cache-prefer share idea is excellent. I only "discovered" cache preferences a short time ago and didn't have a good idea of how they could be used.

 

I didn't even know there was a Cache Minimum Free setting - but what happens when you hit the minimum free (ignoring for this discussion any cache-prefer shares)? Does this trigger a warning (and continue writing into the "buffer") or does it just act like a hard limit to the cache drive size?

Link to comment

Cache-only shares will go beyond the minimum, as will writes made directly to cache. The purpose is to allow cache-prefer or cache-yes shares to overflow to the array so they don't go beyond the minimum accidentally.

 

Each of your user shares also has a minimum free setting. If a disk has less than minimum, Unraid will choose another disk when deciding which disk to write a file to.

 

Note that in any case, once Unraid has decided which disk to write a file to, it will try to write the whole file to that disk, even if it runs out of space. It has no way to know in advance how large a file will become.

 

The usual recommendation is to set minimum to larger than the largest file you expect to write to the user share (or to cache in the case of cache minimum).

 

 

Link to comment
23 minutes ago, trurl said:

Cache-only shares will go beyond the minimum, as will writes made directly to cache. The purpose is to allow cache-prefer or cache-yes shares to overflow to the array so they don't go beyond the minimum accidentally.

 

Thanks, that's useful information. Got to re-think my setup when I eventually sort out the cache disk.

Link to comment
