
Cache shares and disk unavailable (Input/Output error)


Solved by JorgeB


Hey, new Unraid guy here. I had a power loss last night. When I booted up the server this morning the parity check started (of course), but I noticed the appdata and system shares were missing completely. I have them both set to Cache: Prefer. I then checked my cache disk through the terminal and it gave me "/bin/ls: cannot access 'cache': Input/output error". I found a post where someone suggested running a filesystem check against the disk, but that requires me to start the array. I'm getting a bit frantic now, as I also found out that cache drives are not protected by parity.

Q1: Can I pause the parity check, try to fix the cache drive, and resume the parity check later on? Nothing much was running when the power loss happened, as it was at 4am. The only thing running was Docker with a couple of containers, so I'm guessing the cache drive is the only one that could have been doing any work when everything shut down. I can add that I do see all my other data; at a glance it looks okay.

Q2: Could someone explain to me what's going on here? I'm a "biiiit" lost as to what has happened.

Added diagnostic files in case they are needed.

unraidhub-diagnostics-20220714-1141.zip

Link to comment

Replying for historical purposes:

Ran xfs_repair with -n and got a mile-long list. Removed the parameter, ran it again, and then got:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.


Started up the array, but got "Unmountable: Wrong or no file system". Went back into maintenance mode, ran it with -L as suggested, then without parameters again. Started the array normally and the cache disk mounted again. Found one empty file (named 201368677) in lost+found. Not sure if I lost anything since I zeroed the log, but it seems to work fine now.
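For anyone finding this later, the whole sequence boiled down to roughly the following, run from Maintenance mode against the cache device (the device name below is just a placeholder; check lsblk or the Main tab for the real one):

    xfs_repair -n /dev/sdX1    # dry run: report problems, change nothing
    xfs_repair /dev/sdX1       # normal repair; may refuse and ask you to mount/replay the log first
    xfs_repair -L /dev/sdX1    # last resort: zero the log, which can drop the most recent metadata

Anything the repair cannot reattach ends up in lost+found on that filesystem, which is where the empty 201368677 file above came from.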

 


Side question, as we're talking about parity and such: now that I have Cache: Prefer for appdata, do I manually need to back up this data to a disk that is covered by parity? As I see it, if the cache disk suddenly dies I have no way of recovering what it contained.
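Something like this is what I'm imagining (the paths and the "backup" share are just guesses on my part), a manual copy of the pool contents onto a share that lives on the parity-protected array:

    # rough idea only -- assumes a "backup" share configured so it ends up on the array
    rsync -a /mnt/cache/appdata/ /mnt/user/backup/appdata/

Is that what people actually do, or is there a better way?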

 

 

Link to comment

Ah, okay, that's nice to know. Though I'm finding it hard to understand what the best practice is here. I looked into CA Appdata Backup, which backs up my persistent Docker data, however it says the source cannot be /mnt/user or /mnt/disk. When using Cache: Prefer my data can live both on the cache and on an array disk. I feel like I should then use Cache: Only on these shares, though I see people telling me NOT to use that option. I'm getting a bit confused about how I should set this up. Do you have any thoughts/suggestions? Since the cache doesn't just mirror the array disk, wouldn't it be the same for me to just remove the cache, add it as an array disk, and force all my appdata to live on that? That way I'd have the speed of the SSD and the protection of parity.

Link to comment
6 minutes ago, DCWolfie said:

When using Cache: Prefer my data can live both on the cache and on an array disk.

Note that this is an OR option, and that mover will try and move anything off the array and onto the cache so it would not make sense to do this.


There is nothing wrong with having appdata as Use Cache=Only once you have ensured it is all off the main array. The reason many people use the Prefer option is that it allows for overflow to the array if the cache gets full, and auto-move back if space later becomes available. If using the Prefer option, make sure you have a sensible value for the Minimum Free Space setting on the cache so Unraid knows when to start overflowing to the array.

 

There is nothing to stop you having appdata purely on the array, but most people do not want this as you normally get much better performance if it is on a pool/cache.

Link to comment
12 minutes ago, itimpi said:

Note that this is an OR option, and that mover will try and move anything off the array and onto the cache so it would not make sense to do this.

Hm... I don't feel like that's the case here. As of writing I have just a small M.2 assigned as cache. It has ~5GB of available space, while the array disk it's caching has 500MB of data. Meaning my appdata lives on both disks (I do not have a minimum free space set on the cache). So given it's either cache or array (if the cache has available space), shouldn't this have been moved over?
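For reference, this is how I checked where the share actually lives; disk numbers will obviously differ per setup:

    # shows how much of the share sits on the pool vs. each array disk
    du -sh /mnt/cache/appdata /mnt/disk*/appdata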

 

12 minutes ago, itimpi said:

There is nothing wrong with having appdata as Use Cache=Only once you have ensured it is all off the main array. The reason many people use the Prefer option is that it allows for overflow to the array if the cache gets full, and auto-move back if space later becomes available. If using the Prefer option, make sure you have a sensible value for the Minimum Free Space setting on the cache so Unraid knows when to start overflowing to the array.

Ah okay, that's good to know. I see, but how do those people handle backups like the one we're talking about? If it overflows, the data would no longer be only on the cache disk, and plugins like CA Appdata Backup don't allow backing up from array disks. Maybe they just don't do it?
Absolutely. I've got to get a bigger drive if I'm going to use the cache.

 

12 minutes ago, itimpi said:

There is nothing to stop you having appdata purely on the array, but most people do not want this as you normally get much better performance if it is on a pool/cache.

But wouldn't you just get the speed of the disk itself? Or is it some magic trickery Unraid does when it's within a cache pool?

Edited by DCWolfie
Link to comment
11 minutes ago, DCWolfie said:

But wouldn't you just get the speed of the disk itself? Or is it some magic trickery Unraid does when it's within a cache pool?

Only for reads. If appdata is on the array then all writes are slowed by the requirement to keep parity updated as well.

 

12 minutes ago, DCWolfie said:

(I do not have a minimum space set on the cache)

This can cause problems for two reasons:

  • If the cache gets really full then it is more likely to get file system level corruption.
  • Unraid picks the target drive for a new file before it knows how big the file is going to be. If the file subsequently fails to fit you get a write error (i.e. Unraid does not change its mind). You need a Minimum Free Space value on the cache that is larger than the largest file you expect to write; once free space drops below this value Unraid gracefully bypasses the cache and uses the array instead, avoiding this scenario (a rough way to pick a value is sketched below).
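A rough, unofficial way to pick a value (paths and numbers are only examples): look at what is left on the pool and at the biggest files you normally write, then set Minimum Free Space comfortably above that.

    df -h /mnt/cache                                                # current free space on the pool
    find /mnt/cache -type f -printf '%s %p\n' | sort -n | tail -5   # the five largest files currently there
    # e.g. if the biggest files are ~30GB, a Minimum Free Space of 50GB leaves headroom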
Link to comment
3 minutes ago, itimpi said:

Only for reads. If appdata is on the array then all writes are slowed by the requirement to keep parity updated as well.

Ah! Gotcha, that makes sense.

 

3 minutes ago, itimpi said:

This can cause problems for two reasons:

  • If the cache gets really full then it is more likely to get file system level corruption.
  • Unraid picks the target drive for a new file before it knows how big the file is going to be. If the file subsequently fails to fit you get a write error (i.e. Unraid does not change its mind). You need a Minimum Free Space value on the cache that is larger than the largest file you expect to write; once free space drops below this value Unraid gracefully bypasses the cache and uses the array instead, avoiding this scenario.


Yeah, I'm going to add it, I'd just forgotten it 😬. Aha, that makes sense. Thanks a lot.

I'd also like to add that I've been thinking wrong about the appdata backup issue. I forgot that Unraid handles which disk holds my data; I don't need to specify diskX. I could just back up /mnt/user/appdata and I would be fine. So yeah, I'll just keep using the cache ☺️. Again, thanks a lot, to both of you.

Edited by DCWolfie
Link to comment

I use Appdata Backup to back up all the appdata daily to a "backup" share that's set to Cache: No so it's on the array, a db-backup container that adds a dump of my MySQL DBs, plus a custom user script that backs up my separate Nextcloud data share into there as well. Then I have a Luckybackup schedule to dump all of that backup share onto a remote machine daily, and an Unassigned Devices script to do the same to an external drive whenever I plug it in once in a while.
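The custom user script part is nothing fancy, roughly along these lines (share names obviously depend on your setup):

    #!/bin/bash
    # mirror the Nextcloud data share into the array-backed backup share
    rsync -a --delete /mnt/user/nextcloud/ /mnt/user/backup/nextcloud_data/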

 

As mentioned people tend to overestimate parity - it only protects against an outright drive failure but not against any other form of data/filesystem corruption, accidental deletion, ransomware or the likes, so anything important needs separate backups according to the usual best practices... 

Edited by Kilrah
Link to comment
4 minutes ago, Kilrah said:

I use Appdata Backup to backup all the appdata daily to a "backup" share that's set to cache:no so it's on the array, plus a custom user script that backs up my separate nextcloud data share into there as well. Then I have a Luckybackup schedule to dump that onto a remote machine daily as well, and an Unassigned Devices script to do the same to an external drive whenever I plug it in once in a while.

 

As mentioned people tend to overestimate parity - it only protects against an outright drive failure but not against any other form of data/filesystem corruption, accidental deletion, ransomware or the likes, so anything important needs separate backups according to the usual best practices... 

Ah, you're actually one of those who follows the rules of backup 😄. Yeah, I know. But right now, one step at a time. First I've got to get this all up and running how I like it, then I'll look into remote/local backups. Thanks for the tip on Luckybackup though, hadn't heard about it before.

Yeah, I kind of know the limitations. It's basically drive failures I'm worried about most, though those are definitely important aspects I will need to look into in the future.

Link to comment
5 minutes ago, DCWolfie said:

drive failures I'm worried about most, though those are definitely important aspects I will need to look into in the future.

Those other things are more common than drive failures.

 

7 minutes ago, DCWolfie said:

one step at a time

You should always have another copy of anything important and irreplaceable, that should be the first step.

 

Link to comment
Just now, trurl said:

Those other things are more common than drive failures.

😬 So in order to be safe against those aspects, the solution is backup, backup, backup?
 

Just now, trurl said:

You should always have another copy of anything important and irreplaceable, that should be the first step.

Yeah, 100%. Though, what I have on there is replaceable. Just a hassle to replace.

Link to comment
