
Cache drive errors. What steps should I take?


Go to solution Solved by JorgeB,


Happy new year! I have tried looking up these errors and am not sure exactly what steps to take so I have turned to the forums. 

 

Setup

 

I am running - Unraid Version: 6.11.5

I am using - Community Applications, Dynamix File Manager, My Servers and User Scripts.

Docker - Cloudflare-DDNS, mariadb, nextcloud, plex, swag and vaultwarden.

 

I've got two Sabrent Rocket 4.0 1TB NVMe drives running in RAID 1 as cache.

3x 18TB Seagate Exos HDDs in dual parity (I plan to add more in the future).

 

The Problem

 

Last month I was listening to music on Plexamp and it suddenly stopped. I checked the server and it said cache drive nvme0n1 was missing. Without thinking, I rebooted the system (through the Unraid option). Once it had booted back up the drive appeared again, and that's when I decided to get a second NVMe and run the two in RAID 1. After rebooting again the drive showed up as missing; after a few more reboots (I realise this probably isn't too smart) the drive showed up again. This is still the case on reboots now: when the server starts back up, sometimes the NVMe is missing and sometimes it's not.

 

I checked my syslog and it keeps showing errors on my cache drive (see image):

image_2023-01-01_133617644.thumb.png.de0c787a311c5b97801dd037511872e7.png

 

I searched for this error and found a post saying to run this command (btrfs dev stats /mnt/cache), so I did (see image):

signal-2023-01-01-125900_002.png.fcd124fdaf6b5b34b9e72dc84fc8a375.png

 

The post said all of the counters above are meant to show 0.
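For anyone who would rather check the counters with a script than read them off a screenshot, here is a minimal Python sketch. It assumes the usual `[device].counter  value` layout that btrfs dev stats prints. The function name `nonzero_counters` and the sample text are made up for illustration; they are not the values from the image above.

```python
import re

# Matches lines such as "[/dev/nvme0n1p1].write_io_errs    0".
STAT_LINE = re.compile(r"^\[(?P<device>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_counters(stats_output):
    """Return {device: {counter: value}} for every counter that is not 0."""
    problems = {}
    for line in stats_output.splitlines():
        match = STAT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        value = int(match.group("value"))
        if value != 0:
            device = match.group("device")
            problems.setdefault(device, {})[match.group("counter")] = value
    return problems


# Hypothetical sample output, NOT the numbers from this thread's screenshot.
SAMPLE = """\
[/dev/nvme0n1p1].write_io_errs    12
[/dev/nvme0n1p1].read_io_errs     3
[/dev/nvme0n1p1].flush_io_errs    0
[/dev/nvme0n1p1].corruption_errs  0
[/dev/nvme0n1p1].generation_errs  0
[/dev/nvme1n1p1].write_io_errs    0
[/dev/nvme1n1p1].read_io_errs     0
"""

print(nonzero_counters(SAMPLE))
```

On the server itself, the text could come from `subprocess.run(["btrfs", "dev", "stats", "/mnt/cache"], capture_output=True, text=True).stdout` instead of the sample string. An empty result means every counter is 0, which is what the post says a healthy pool should show.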

 

Plan

From my understanding this is a problem with the cache drive itself, and I am planning on replacing the NVMe. Is this the correct thing to do, or is there something I can do in software?

 

Thank you for your time - Ashley

pacific-diagnostics-20230101-1312.zip


Alright, will do! Just one more thing: is this a btrfs issue, an NVMe issue, or me being silly? From what I've seen, people with similar issues have swapped to xfs as a "solution".

 

Thank you for your help, I checked your profile and the amount of people you help is crazy! Thanks again 😄

  • 8 months later...

Hi. I wanted to give a heads-up to anyone experiencing this problem from time to time.

Problem:

My cache pool consists of 3 SSDs, 2 of which are fairly old (they lived through 2 laptops). One of the SSDs would get disconnected in the middle of pool operation, and errors would start to accumulate until I rebooted the system and ran a scrub.

Suggested solution / investigation:

Some forum posts suggested that the issue lies in the SSDs themselves, the cables, or even the backplane of the motherboard. I swapped cables and tried 2 different PCIe-to-SATA adapters; nothing helped.

Actual solution (in my specific case):

TLDR - I bought and installed a UPS for the server. All the errors stopped appearing.

Longer version - I noticed that my 3D printer sometimes had a horizontal shift in the prints in one of the directions. That's apparently a sign of a short power outage (possibly due to voltage fluctuations) - not so long that the printer would stop and force me to restart the print, but long enough for it to lose its position. I decided that losing my data because of a voltage spike would defeat the purpose of a NAS, so I bought a UPS (the first one I found that satisfied my wattage needs - APC Back UPS BX - BX950MI-GR - 950 VA). No errors have appeared since, and as an added benefit it has a battery, so Unraid can shut down safely in the event of a power outage if you connect the UPS to the server via USB.

  • 1 month later...

This sometimes helps: on the Main GUI page, click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off


Reboot and see if it makes a difference.
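After the reboot it can be worth confirming that the two options actually reached the kernel. The sketch below is one hedged way to do that in Python: it compares a kernel command line (on Linux this is readable from /proc/cmdline) against the two parameters quoted above. The helper names `missing_params` and `read_cmdline` and the two sample strings are illustrative assumptions, not output captured from an Unraid box.

```python
# The two options suggested in this post.
REQUIRED_PARAMS = ("nvme_core.default_ps_max_latency_us=0", "pcie_aspm=off")


def missing_params(cmdline, required=REQUIRED_PARAMS):
    """Return the required options that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [param for param in required if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running Linux kernel was booted with."""
    with open(path) as handle:
        return handle.read()


# Hypothetical command lines before and after editing the append line.
BEFORE = "BOOT_IMAGE=/bzimage initrd=/bzroot"
AFTER = ("BOOT_IMAGE=/bzimage initrd=/bzroot "
         "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off")

print(missing_params(BEFORE))  # both options are reported as missing
print(missing_params(AFTER))   # nothing is reported as missing
```

On the server, `missing_params(read_cmdline())` returning an empty list would suggest the edited boot option is the one that was used. If it still lists either parameter, the change probably went into a different boot entry than the default one.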
