cache pool errors - failing drive or ??

Recently, my cache drive has begun to report errors. I thought perhaps the NVMe drive was beginning to fail (Samsung 980 Pro 1TB), but it passes self tests.
I don't want to start a warranty claim process if it's not the drive. Any way to find out what's happening? The error messages aren't very helpful, at least not to me.
See attached screenshots.
EDIT: SMART report added also.
EDIT2: added obligatory diagnostics. ;)

EDIT3/Resolution: it was the drive. Samsung replaced it under warranty.

cap 003.JPG

cap 004.JPG

cap 006.JPG

cap 007.JPG

cap 008.JPG

tower-smart-20260606-1008.zip tower-diagnostics-20260606-1025.zip

Edited by Elmojo
resolution

Solved by Vvei_61

  • Author

crickets....
No one? 😅

  • Solution

Your drive is failing. The self-test passing doesn't mean it's healthy – it only checks if the drive responds, not whether all sectors are readable.

Look at the SMART error log: 399+ Unrecovered Read Errors, all pointing to the same LBA range. That's permanent bad sectors. Back up your data now, this can fail completely without any further warning.
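For anyone who wants to cross-check what the drive itself reports, here is a rough sketch that reads smartctl's JSON output. It assumes smartmontools 7.0 or newer (for `--json`), a hypothetical device path `/dev/nvme0`, and that the NVMe health log exposes `critical_warning`, `media_errors` and `num_err_log_entries`. The function names are made up for illustration.

```python
import json
import subprocess


def nvme_error_summary(report: dict) -> dict:
    """Pull the error-related counters out of a parsed `smartctl --json` report."""
    # Assumption: these key names match the JSON emitted by your smartctl version.
    log = report.get("nvme_smart_health_information_log", {})
    critical_warning = log.get("critical_warning", 0)
    media_errors = log.get("media_errors", 0)
    return {
        "critical_warning": critical_warning,
        "media_errors": media_errors,
        "error_log_entries": log.get("num_err_log_entries", 0),
        # Counters like these are maintained by the drive, not by the OS.
        "suspect": media_errors > 0 or critical_warning != 0,
    }


def read_report(device: str = "/dev/nvme0") -> dict:
    """Run smartctl against a device (hypothetical path) and parse its JSON."""
    # smartctl uses non-zero exit codes as status bits, so check=True is avoided.
    result = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `nvme_error_summary(read_report())` as root would print-ready summarise the counters. A non-zero `media_errors` value is a hint to look closer, not a full diagnosis.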

Frank

  • Author
3 minutes ago, Vvei_61 said:

Your drive is failing

Thanks, but I'm not 100% sure I believe that. Browsing the Unraid subreddit, it appears that there have been quite a few of us who are suddenly getting "corruption" errors on cache drives/pools since updating to 7.3.
That seems very suspicious to me. It could be that now it's doing a better job of scanning drive health, but having all these drives 'fail' at once, and in the same way, just feels odd.
I will absolutely back up and scan the drive more deeply. I'll also prepare to replace it, just in case.
Speaking of, what's the easiest way to temporarily move everything off the cache and onto the array? Should I just edit the share settings, or is there a better way?
I've been running this machine for several years, but there are still many things for which I'm a total beginner. :)

The Unraid 7.3 correlation is interesting and worth keeping an eye on. But Unrecovered Read Errors in the SMART log are written by the drive itself, not by Unraid. Software can't generate those entries.

For moving data off the cache: yes, editing the share settings is the easiest way. Set the Use Cache option to "No" or "Prefer Array" on your shares, then run the mover. Everything will transfer to the array automatically.

  • Community Expert

It is worth pointing out that if you want to use a self-test on a drive then only the extended self-test is really indicative of health. The short test is primarily about testing the electronics - only a small number of sectors are tested.

  • Author
3 hours ago, Vvei_61 said:

Set the Use Cache option to "No" or "Prefer Array" on your shares, then run the mover. Everything will transfer to the array automatically.

Awesome, thanks. It appears that I may need to do that. See below...

2 hours ago, itimpi said:

It is worth pointing out that if you want to use a self-test on a drive then only the extended self-test is really indicative of health.

Good to know, thanks! I ran the long SMART test and it reported "Completed: failed segments". The report is attached.
It seems there are quite a few "Unrecovered Read Error" entries.
I probably need to initiate a warranty replacement with Samsung. Bugger.

tower-smart-20260606-1955.zip

  • Community Expert
5 hours ago, Elmojo said:

"Completed: failed segments"

That means the test failed, and the device should be replaced.

  • Author
17 hours ago, Vvei_61 said:

Set the Use Cache option to "No" or "Prefer Array" on your shares, then run the mover. Everything will transfer to the array automatically.

So...I don't have those options. I can change the Primary Storage from 'cache' to 'array', and leave the secondary storage to 'none'.
However, running mover does nothing. No disk activity is observed, and no files are moved off the cache drive.

cap 003.JPG

cap 004.JPG

  • Community Expert

If you want mover to do anything you need to have both primary and secondary storage set and the appropriate mover direction.

  • Author
1 hour ago, itimpi said:

If you want mover to do anything you need to have both primary and secondary storage set and the appropriate mover direction.

I'm super confused then. I've had it set that way since the beginning, and it's never moved anything from the cache to the array. I assumed it was because the cache hadn't reached the selected fill level.
That also means that the advice given above by Vvei_61: "Set the Use Cache option to "No" or "Prefer Array" on your shares, then run the mover. Everything will transfer to the array automatically." is incorrect, since I don't even have those options.
So what is the proper process for getting the data off this cache so I can replace it? Someone please walk me through the steps. I'm getting concerned I may lose data. The errors are over 1500 now.

You need both storages set: Primary = Cache, Secondary = Array, and Move action = Cache → Array. With Secondary set to "none" the mover has nowhere to send the files.

Make sure to do this on all shares that use the failing cache drive, then run the mover.
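The rule above can be written down as a small check. This is only a sketch of the logic: `ShareStorage` and its field values are a hypothetical model of the share settings page, not an Unraid API or config format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShareStorage:
    """Hypothetical model of one share's storage settings (not an Unraid API)."""
    name: str
    primary: str                 # e.g. "cache" or "array"
    secondary: Optional[str]     # None stands for "none" in the GUI
    mover_action: Optional[str]  # "primary->secondary" or "secondary->primary"


def mover_will_drain_cache(share: ShareStorage) -> bool:
    """True only when cache files have a destination and the direction is right."""
    return (
        share.primary == "cache"
        and share.secondary == "array"
        and share.mover_action == "primary->secondary"
    )


shares = [
    ShareStorage("appdata", "cache", "array", "primary->secondary"),
    ShareStorage("domains", "array", None, None),  # secondary "none": mover skips it
]
needs_fixing = [s.name for s in shares if not mover_will_drain_cache(s)]
print(needs_fixing)  # ['domains']
```

Any share that shows up in `needs_fixing` would have to be changed before the mover can empty the cache for it.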

Frank

3 minutes ago, Vvei_61 said:

You need both storages set: Primary = Cache, Secondary = Array, and Move action = Cache → Array. With Secondary set to "none" the mover has nowhere to send the files.

Make sure to do this on all shares that use the failing cache drive, then run the mover.

Looking at the top screenshot, your settings actually look correct already - Primary = Cache, Secondary = Array, Move action = Cache → Array.

Did you start the mover manually? It won't run automatically unless scheduled. And did you get any error message when you ran it?

Frank

  • Author
7 hours ago, Vvei_61 said:

Looking at the top screenshot, your settings actually look correct already - Primary = Cache, Secondary = Array, Move action = Cache → Array.

Did you start the mover manually? It won't run automatically unless scheduled. And did you get any error message when you ran it?

Yeah, that's why I'm so confused. I have it set as shown, but when I invoke the mover manually, nothing happens. No errors. I get the little unraid 'wave' icon for a moment, then it goes back to the previous screen, like it's done.
It's always done this, so I just assumed that was normal behavior until the cache reached the threshold of fill for when it was supposed to 'spill over' onto the array. If not, then it's never worked correctly since day-one.
How do I troubleshoot this, to figure out why the mover...isn't? lol

  • Author
10 hours ago, JorgeB said:

Thank you JorgeB! It seems that the first steps of manually stopping the Docker and VM services are key.
After doing that, mover is at least reporting as active. However, it's running very slooowwly. I mean like the transfer rates are in the KBs. O.o
Should I just leave it and see what happens? I hate to have my server offline for so long, especially not knowing if this might be a multi-day process at these speeds....
EDIT: I have 725GB of data to move. At the currently reported speeds, this will be a process of weeks. Something has to change.
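As a back-of-the-envelope check on the "weeks" estimate: the 725 GB total is from the post, while the 500 KB/s rate is an assumed figure standing in for "speeds in the KBs". Decimal units (1 GB = 10^9 bytes) are also an assumption.

```python
SECONDS_PER_DAY = 86_400
GB = 1000 ** 3
KB = 1000


def transfer_eta_days(total_bytes: float, bytes_per_second: float) -> float:
    """Estimated transfer time in days at a constant rate."""
    if bytes_per_second <= 0:
        raise ValueError("transfer rate must be positive")
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


# 725 GB at an assumed 500 KB/s
eta = transfer_eta_days(725 * GB, 500 * KB)
print(f"{eta:.1f} days")  # 16.8 days
```

At that rate the move would take a bit over two weeks, so the observed speed is the thing to watch rather than the total size.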

Edited by Elmojo
added data total info

  • Author

So it appears that the mover was working, it just took a while. It has now moved everything off of cache, except for ONE of my VM folders. The VM manager is disabled, so obviously no VMs are running. Why will this one not move, or how can I find out why it's not? Can I move it manually? If so, how?

I finally got it to move manually.

Now, the lingering issue is that when I restart the docker and VM services, it recreates the appdata folder on the cache. I've confirmed that the appdata share is set to "array" only.
How can I get all the docker containers to stop writing to the cache?

Edited by Elmojo

  • Community Expert

You need to check the drive mappings for your containers (and docker itself) to ensure there are no references to /mnt/cache/appdata.
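Checking every container by hand can be tedious, so here is a rough sketch that scans the saved container templates for pool-specific paths. It assumes the user templates are XML files under `/boot/config/plugins/dockerMan/templates-user` on the flash drive; verify that location on your own server. The function name is made up for illustration, and the script only reads files, it changes nothing.

```python
from pathlib import Path

# Assumed location of user container templates on the Unraid flash drive.
TEMPLATE_DIR = Path("/boot/config/plugins/dockerMan/templates-user")


def find_pool_references(
    template_dir: Path, needle: str = "/mnt/cache/"
) -> dict[str, list[str]]:
    """Return {file name: [matching lines]} for templates that mention `needle`."""
    hits: dict[str, list[str]] = {}
    for xml_file in sorted(template_dir.glob("*.xml")):
        text = xml_file.read_text(encoding="utf-8", errors="replace")
        lines = [line.strip() for line in text.splitlines() if needle in line]
        if lines:
            hits[xml_file.name] = lines
    return hits
```

Running `find_pool_references(TEMPLATE_DIR)` on the server would list the templates that still point directly at the cache pool. The Docker settings page itself would still need a manual look.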

  • Author
1 hour ago, itimpi said:

You need to check the drive mappings for your containers (and docker itself) to ensure there are no references to /mnt/cache/appdata.

Oh lordy, individually for each container?!
And I guess I'll have to map them all back once I've replaced the cache drive?
This really feels like something that should be done automatically when the share settings are changed...

7 hours ago, JorgeB said:

Post new diags please

Attached...

tower-diagnostics-20260609-0937.zip

  • Community Expert

Shares are correct, so it's most likely what itimpi mentioned.

  • Author
12 minutes ago, JorgeB said:

Shares are correct, so it's mostly what Itimpi mentioned.

Well that sucks.
Also, my VMs (some of them) are broken now. The disk image can't be read in the new location (array) so the machine won't boot.
I have no idea what to do about this. One of these VMs runs my security cameras, so it's fairly urgent to get it running again...

  • Community Expert
37 minutes ago, Elmojo said:

Oh lordy, individually for each container?!
And I guess I'll have to map them all back once I've replaced the cache drive?
This really feels like something that should be done automatically when the share settings are changed...

Your docker.img and domain.img specify user shares (appdata, domains, system), not specific drives or pools, so it does work automatically if you set those shares to be on the array, and as long as you haven't specified a particular drive or pool for any docker or VM.

Then you set them back to prefer cache to get them moved back as explained at that link.

  • Community Expert
2 hours ago, Elmojo said:

And I guess I'll have to map them all back once I've replaced the cache drive?

If you have containers mapped to use /mnt/user/appdata then they will not need changing again as /mnt/cache/appdata is part of /mnt/user/appdata.

Direct mapping to /mnt/cache/? locations used to be a way of getting better performance because it bypassed the FUSE layer typically involved in user shares. However, current releases of Unraid have the 'exclusive' option for shares that are all on one device, which also bypasses FUSE. That gives the same performance benefit while still using a path that is not device-specific.
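To illustrate the difference between the two kinds of path, here is a small sketch that rewrites a pool-specific path to its user share equivalent. The helper name and the example container folder are hypothetical; the pool name `cache` matches this thread but may differ on other servers.

```python
from pathlib import PurePosixPath


def to_user_share_path(path: str, pool: str = "cache") -> str:
    """Map /mnt/<pool>/<share>/... to /mnt/user/<share>/...; leave other paths alone."""
    parts = PurePosixPath(path).parts
    # parts looks like ('/', 'mnt', 'cache', 'appdata', ...)
    if len(parts) >= 3 and parts[:3] == ("/", "mnt", pool):
        return str(PurePosixPath("/mnt/user", *parts[3:]))
    return path


print(to_user_share_path("/mnt/cache/appdata/example"))  # /mnt/user/appdata/example
print(to_user_share_path("/mnt/user/appdata/example"))   # /mnt/user/appdata/example
```

A mapping written in the `/mnt/user/...` form keeps working wherever the share's files currently live, which is why it does not need editing when the cache drive is swapped.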
