sonofdbn Posted April 13, 2020 Share Posted April 13, 2020 My Win10 VM suddenly got disconnected, and when I checked the server dashboard, the log was at 100%. I downloaded a diagnostics file at that point. Then I rebooted and as expected, log was cleared, but now Fix Common Problems reports "Unable to write to cache" and "Unable to write to Docker Image". (I took a quick look at rTorrentVPN and while it starts, the rTorrentVPN GUI shows no activity, not even a list of torrents.) I have attached the post-reboot diagnostics file as well. What should I do next? Second - tower-diagnostics-20200413-1938.zip First - tower-diagnostics-20200413-1845.zip Quote Link to comment
JorgeB Posted April 13, 2020 Share Posted April 13, 2020 The cache filesystem is corrupt: back up, re-format the pool and restore the data. P.S. If you're not using ECC RAM, it's a good idea to run memtest, since there were checksum errors. Quote Link to comment
sonofdbn Posted April 13, 2020 Author Share Posted April 13, 2020 I'm using ECC RAM, so I didn't run memtest. My problem is how to backup the files on the cache drive(s). I've tried various things to copy files off the drives, but everything I've tried throws up errors. (I've tried mc via SSH and Teracopy from my Windows PC, WinSCP and a few others.) I've also tried the CA Appdata Backup plugin, but it just flashes briefly that it's working and there's no error message, but the output folders are empty. Quote Link to comment
sonofdbn Posted April 13, 2020 Author Share Posted April 13, 2020 Unfortunately no joy. I went to the link you provided, and tried the first two approaches. My cache drives (btrfs pool) are sdd and sdb. 1) Mount filesystem read only (non-destructive): I created mount point /x and then tried: mount -o usebackuproot,ro /dev/sdd1 /x This gave me the error: "mount: /x: can't read superblock on /dev/sdd1." (Same result when I tried sdb1.) Then I tried: mount -o ro,notreelog,nologreplay /dev/sdd1 /x This produced the same error. So I moved to 2) BTRFS restore (non-destructive): I created the directory /mnt/disk4/restore, then entered: btrfs restore -v /dev/sdd1 /mnt/disk4/restore After a few seconds I got this error message: "/dev/sdd1 is currently mounted. Aborting." This looks odd (in that the disk is mounted and therefore presumably accessible), so I thought I should check whether I've missed anything so far. Quote Link to comment
JorgeB Posted April 13, 2020 Share Posted April 13, 2020 3 minutes ago, sonofdbn said: This looks odd (in that the disk is mounted and therefore presumably accessible), Yep, odd, please post diags. Quote Link to comment
sonofdbn Posted April 13, 2020 Author Share Posted April 13, 2020 Thanks for looking into this. Here are the diagnostics. (It's getting late here; I'll be back in the morning.) tower-diagnostics-20200414-0058.zip Quote Link to comment
JorgeB Posted April 13, 2020 Share Posted April 13, 2020 Yes, cache is showing as correctly mounted, though it is completely full. Can't you access it via shares or console? Syslog is spammed with syslog server related errors (possibly because it's writing to cache and it ran out of space), so I can't see if there are more serious issues. Quote Link to comment
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 I can access the cache, but copying files usually produces errors. For example, this is what I get when I try to copy \appdata\MKVToolnix to a local drive: [screenshot not preserved] Trying to copy the same folder to /mnt/disk4 on the server via mc on console produces an error in the same place (?): [screenshot not preserved] The other weirdness is that some files seem to go offline and then online again very rapidly, and there's no visible copying progress. Unfortunately I can't seem to capture the Teracopy log as text, but here's a screencap: [screenshot not preserved] Some files do copy OK, but backing up what I can would be almost like trying to copy each file individually and seeing which ones work. So I'm a bit stuck at the moment. Quote Link to comment
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 9 hours ago, johnnie.black said: Yes, cache is showing as correctly mounted, though it is completely full. Can't you access it via shares or console? Syslog is spammed with syslog server related errors (possibly because it's writing to cache and it ran out of space), so I can't see if there are more serious issues. Just to add: the cache seems to be mounted as read-only, so I can't delete anything. Is there a way to set it back to "normal" and allow me to clear some space? Quote Link to comment
JorgeB Posted April 14, 2020 Share Posted April 14, 2020 The read errors suggest checksum errors; see if you can get the last lines of the syslog. Quote Link to comment
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 Here's /var/log/syslog. There's an earlier syslog.1 as well, but it's 15MB. Let me know if you need that as well. syslog Quote Link to comment
JorgeB Posted April 14, 2020 Share Posted April 14, 2020 As suspected, there are checksum errors. btrfs gives an i/o error when corrupted data is detected (and can't be fixed) so you know there's a problem. You can use btrfs restore (with the pool unmounted) to copy that data, since it will ignore any checksum errors, but the data will still be corrupt. Quote Link to comment
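The "pool unmounted" requirement is what caused the earlier "/dev/sdd1 is currently mounted. Aborting." message. A minimal sketch of a pre-flight check follows, assuming the device and destination paths mentioned earlier in the thread. The function names and the optional mounts-file argument are hypothetical helpers for illustration, and the script only prints the restore command rather than running it.

```shell
#!/bin/sh
# Sketch only: refuse to suggest `btrfs restore` while the device is mounted.
# is_mounted and restore_cmd are illustrative names, not part of btrfs-progs.

is_mounted() {
    # $1 = device node, $2 = mounts table to read (defaults to /proc/mounts)
    # Caveat (assumption): for a multi-device pool the table may list only
    # one member, so this check alone can miss a mounted pool.
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

restore_cmd() {
    # $1 = source device, $2 = destination directory, $3 = optional mounts table
    device=$1
    dest=$2
    if is_mounted "$device" "${3:-/proc/mounts}"; then
        echo "refusing: $device is mounted; unmount the pool first" >&2
        return 1
    fi
    # Print the command for review instead of executing it.
    echo "btrfs restore -v $device $dest"
}
```

Calling `restore_cmd /dev/sdd1 /mnt/disk4/restore` would print the command only when the device does not appear in the mounts table. As noted above, files copied this way can still contain corrupt data.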
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 So it sounds like I've pretty much lost the data on the cache drive, although I might be lucky with some files (and it would take forever to work out which files are OK). That being the case, I think I should just try to recreate the cache drive from scratch. A real pain, but doable. Before I do that, is there anything wrong with the drives themselves? And do you think it was the corruption that led to the cache drive being full or the other way round? Because if it was the other way round, I need to monitor what goes on in the cache drive more carefully in future. If the corruption was the issue, any idea what caused it? Quote Link to comment
trurl Posted April 14, 2020 Share Posted April 14, 2020 5 minutes ago, sonofdbn said: the other way round My guess. You had a number of shares set cache-only for some reason (though some of those have files on the array). Do you really need all of those to stay on cache? No idea what you are using those for since their names have been anonymized. You have a pretty large cache so it should be pretty hard to fill it up. Are you trying to seed torrents from it or something? I always send torrents directly to the array. Quote Link to comment
JorgeB Posted April 14, 2020 Share Posted April 14, 2020 14 minutes ago, sonofdbn said: and it would take forever to work out which files are OK If you run a scrub it will identify all corrupt files, so any files not on that list will be OK. Quote Link to comment
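One way to turn scrub results into a list of affected files is to pull the file paths out of the syslog afterwards. The sketch below assumes a scrub has already been run (for example with `btrfs scrub start -B -r /mnt/cache`, where `-r` keeps it read-only and `/mnt/cache` is assumed to be the pool's mount point) and that the kernel logs each bad block on a line containing "checksum error" and ending in "(path: ...)". That log format is an assumption to verify against your own syslog, and `corrupt_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list the unique file paths named in checksum-error log lines.
# Assumes each relevant line contains "checksum error" and ends with
# "(path: <file>)"; adjust the patterns if your syslog looks different.

corrupt_paths() {
    # $1 = log file to scan, e.g. /var/log/syslog
    grep 'checksum error' "$1" \
        | sed -n 's/.*(path: \(.*\))[[:space:]]*$/\1/p' \
        | sort -u
}
```

Running `corrupt_paths /var/log/syslog` would then print one path per line; anything on the pool that is not in that list had no checksum error reported by the scrub.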
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 6 minutes ago, trurl said: My guess. You had a number of shares set cache-only for some reason (though some of those have files on the array). Do you really need all of those to stay on cache? No idea what you are using those for since their names have been anonymized. You have a pretty large cache so it should be pretty hard to fill it up. Are you trying to seed torrents from it or something? I always send torrents directly to the array. Cache-only shares are a bit messed up because I set some shares up before I understood how the cache could be used. In reality, I have /appdata, /domains and /isos there, as well as torrents. And, yes, torrents are also seeded from cache (didn't want to keep an array drive spinning). So that's probably the cause? I thought I left myself a reasonable margin (50GB) but perhaps I wasn't paying attention. I also left too many seeding on the cache because the latest versions of unRAID unfortunately slowed down file transfers from cache to array, so I didn't do transfers out as often as I used to. Bottom line, though, is that a full BTRFS cache drive pool can be a pretty bad problem? Is there any notification that I could have enabled? My experience is Windows, and there I usually get a "disk is low on space" message. Quote Link to comment
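A low-space warning of the kind asked about here can be approximated with a small script run on a schedule. This is a minimal sketch, not a built-in Unraid feature: `free_kb` and `check_free` are hypothetical names, the threshold is whatever margin you choose, and free space reported by `df` for a btrfs pool may not be exact, so treat the number as a rough guide.

```shell
#!/bin/sh
# Sketch only: warn when a mount point has less free space than a threshold.

free_kb() {
    # Available space in KiB, taken from the POSIX-format output of df.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

check_free() {
    # $1 = path to check, $2 = minimum free space in KiB
    avail=$(free_kb "$1")
    [ -n "$avail" ] || return 2      # df produced nothing usable
    if [ "$avail" -lt "$2" ]; then
        echo "WARNING: $1 has ${avail} KiB free (threshold $2 KiB)"
        return 1
    fi
    return 0
}
```

For the 50GB margin mentioned above, the call would be `check_free /mnt/cache 52428800` (50 x 1024 x 1024 KiB), with `/mnt/cache` assumed to be the pool's mount point.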
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 48 minutes ago, johnnie.black said: As suspected, there are checksum errors. btrfs gives an i/o error when corrupted data is detected (and can't be fixed) so you know there's a problem. You can use btrfs restore (with the pool unmounted) to copy that data, since it will ignore any checksum errors, but the data will still be corrupt. Can you give more details on btrfs restore? If the pool is unmounted, how do I access the drives? Or do I unmount the drives and then mount them individually as - I don't know - unassigned devices? Quote Link to comment
trurl Posted April 14, 2020 Share Posted April 14, 2020 One thing to consider is that mover can't move open files (such as seeds). That's one of the reasons I send torrents directly to the array. I have them on a share that only includes a single disk. But if you want to put torrents on cache, you should use a cache-prefer share so they can overflow to the array once free space on cache drops below Cache Minimum Free (Global Share Settings). Quote Link to comment
JorgeB Posted April 14, 2020 Share Posted April 14, 2020 8 minutes ago, sonofdbn said: Can you give more details on btrfs restore? If the pool is unmounted, how do I access the drives? Just follow the instructions on the link above. Quote Link to comment
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 6 minutes ago, trurl said: One thing to consider is that mover can't move open files (such as seeds). That's one of the reasons I send torrents directly to the array. I have them on a share that only includes a single disk. But if you want to put torrents on cache, you should use a cache-prefer share so they can overflow to the array once free space on cache drops below Cache Minimum Free (Global Share Settings). I don't use mover at all (so not really using cache disk as a cache). I move files manually. But the cache-prefer share idea is excellent. I only "discovered" cache preferences a short time ago and didn't have a good idea of how they could be used. I didn't even know there was a Cache Minimum Free setting - but what happens when you hit the minimum free (ignoring for this discussion any cache-prefer shares)? Does this trigger a warning (and continue writing into the "buffer") or does it just act like a hard limit to the cache drive size? Quote Link to comment
trurl Posted April 14, 2020 Share Posted April 14, 2020 cache-yes and cache-prefer shares will overflow to the array when cache gets below minimum, cache-only shares have no choice but to go to cache until it runs out of space. Quote Link to comment
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 I guess what I'm saying is what's the use case for a Cache Minimum Free setting? If there's no temporary use of the minimum free space, isn't it just reducing the size of your cache? Quote Link to comment
trurl Posted April 14, 2020 Share Posted April 14, 2020 Cache-only shares will go beyond the minimum, and so will writes made directly to cache. The purpose is to allow cache-prefer or cache-yes shares to overflow to the array so they don't go beyond minimum accidentally. Each of your user shares also has a minimum free setting. If a disk has less than minimum, Unraid will choose another disk when deciding which disk to write a file to. Note that in any case, once Unraid has decided which disk to write a file to, it will try to write the whole file to that disk, even if it runs out of space. It has no way to know in advance how large a file will become. The usual recommendation is to set minimum to larger than the largest file you expect to write to the user share (or to cache in the case of cache minimum). Quote Link to comment
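The rule described here can be written down as a small decision function. This is a toy model of the behaviour as explained in the thread, not Unraid's actual code: `choose_target` and its arguments are hypothetical, and it only covers the three share modes discussed above.

```shell
#!/bin/sh
# Toy model only: where a NEW file lands, per the explanation in the thread.
# The decision is made once, before writing; the file size is not known then.

choose_target() {
    # $1 = share cache mode: only | prefer | yes
    # $2 = current free space on cache (KiB)
    # $3 = Cache Minimum Free setting (KiB)
    case $1 in
        only)
            # Cache-only shares have no alternative, whatever the free space.
            echo cache ;;
        prefer|yes)
            # These modes overflow to the array once cache is below minimum.
            if [ "$2" -lt "$3" ]; then echo array; else echo cache; fi ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}
```

For example, `choose_target prefer 10 100` prints `array`, while `choose_target only 10 100` prints `cache`. Because the target is picked before the file size is known, the model also shows why the minimum should exceed the largest file you expect to write.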
sonofdbn Posted April 14, 2020 Author Share Posted April 14, 2020 23 minutes ago, trurl said: Cache-only shares will go beyond the minimum, and so will writes made directly to cache. The purpose is to allow cache-prefer or cache-yes shares to overflow to the array so they don't go beyond minimum accidentally. Thanks, that's useful information. Got to re-think my setup when I eventually sort out the cache disk. Quote Link to comment