970 Evo M2 SSD cache pool filled with errors [BTRFS]


hey all,

 

First time having a potential hardware issue, so I apologize if I missed certain prerequisite details.

 

My server has been running fine for a long time. A few days ago I even replaced a disk by upgrading parity and reusing the old parity disk as a data disk, and everything was fine after that. But last night Docker started acting weird, and I noticed errors in my cache pool when I downloaded the SMART report. I ran a memtest for 13 hours (4 passes) and it passed, so I guess the 4 RAM sticks are fine.

 

The vdisks can't be copied to my array past a certain point, so I guess they are corrupted. Good thing my appdata is backed up every month by CA Backup. I just want to understand whether my M.2 drives are going bad and I should buy new ones, or whether I should try my luck by reformatting and reusing them. Attached are the diagnostics taken after running a scrub (which couldn't correct any of the errors), plus the output of a command JorgeB recommends running (it shows a whole lot of errors!).

 

Also attached are the SMART reports from both cache disks. I can't do any cable checks since they are M.2 drives connected directly to the motherboard. I have a third drive used as an unassigned disk, which seems to be fine for now (an ADATA drive bought a few years after these two).

 

From the scrub page:

UUID:             d7811189-42b8-4d37-a4d0-dae7ee9e73f6
Scrub started:    Tue Aug 15 21:03:35 2023
Status:           aborted
Duration:         0:09:38
Total to scrub:   478.01GiB
Rate:             833.38MiB/s
Error summary:    read=135
  Corrected:      0
  Uncorrectable:  135
  Unverified:     0


 

root@Tower:~# btrfs dev stats /mnt/cache
[/dev/nvme0n1p1].write_io_errs    0
[/dev/nvme0n1p1].read_io_errs     130
[/dev/nvme0n1p1].flush_io_errs    0
[/dev/nvme0n1p1].corruption_errs  2
[/dev/nvme0n1p1].generation_errs  0
[/dev/nvme2n1p1].write_io_errs    354329
[/dev/nvme2n1p1].read_io_errs     339856
[/dev/nvme2n1p1].flush_io_errs    1334
[/dev/nvme2n1p1].corruption_errs  2806
[/dev/nvme2n1p1].generation_errs  0
root@Tower:~#
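
To compare the two devices at a glance, the counters above can be totalled per device. This is only a sketch: the function name `btrfs_error_totals` is made up for illustration, and it assumes the `[device].counter  value` layout shown in the output above.

```shell
# Hypothetical helper: totals the error counters per device.
# Reads `btrfs dev stats` output on stdin, assuming lines such as
#   [/dev/nvme0n1p1].read_io_errs     130
btrfs_error_totals() {
  awk '
    /^\[/ {
      dev = $1
      sub(/^\[/, "", dev)        # drop the leading "["
      sub(/\]\..*$/, "", dev)    # drop "].counter_name"
      if (!(dev in total)) order[++n] = dev   # remember first-seen order
      total[dev] += $2
    }
    END { for (i = 1; i <= n; i++) print order[i], total[order[i]] }
  '
}
```

Used as `btrfs dev stats /mnt/cache | btrfs_error_totals`, the numbers above would come out as `/dev/nvme0n1p1 132` and `/dev/nvme2n1p1 698325`, which makes it obvious that nvme2n1 is by far the worse of the two.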

 

  

 

 

tower-diagnostics-20230815-2115.zip tower-smart-20230815-2130.zip tower-smart-20230815-2131.zip

P_20230815_194837.jpg

Edited by ars92
add memtest86 results

Solved by JorgeB

  • Community Expert
Aug 15 20:04:37 Tower kernel: critical medium error, dev nvme0n1
...
Aug 15 20:25:34 Tower kernel: critical medium error, dev nvme2n1

 

These are device errors on both pool devices, so yes, they should be replaced. Try to copy whatever you can and create a new pool.
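
For anyone wanting to check their own log for the same thing, here is a rough sketch that tallies these kernel messages per device. The function name `count_medium_errors` is hypothetical, it assumes the message text looks like the lines quoted above, and the log path in the usage line is an assumption about where the syslog lives on your system.

```shell
# Hypothetical helper: counts "critical medium error" kernel messages per device.
# Reads syslog lines on stdin; assumes the text "critical medium error, dev <name>".
count_medium_errors() {
  awk '
    /critical medium error, dev / {
      dev = $0
      sub(/.*critical medium error, dev /, "", dev)  # keep what follows "dev "
      sub(/[ ,].*$/, "", dev)                        # keep only the device name
      if (!(dev in count)) order[++n] = dev          # remember first-seen order
      count[dev]++
    }
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
  '
}
```

Example use, assuming the log is at `/var/log/syslog`: `count_medium_errors < /var/log/syslog`. Each output line is a device name followed by how many times it reported a medium error.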

  • Community Expert
  • Solution

Forgot to mention: nvme2n1 dropped offline at some point in the past, which is likely why the mirrored pool cannot recover everything. It's unlikely that both devices have errors on the same sectors, but because that device was out of sync, it cannot be used to repair the errors on the other one. See here for better pool monitoring in the future.
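
As a minimal sketch of what such monitoring boils down to (this is not the script referred to above, and `check_btrfs_stats` is a made-up name): flag any non-zero counter in the `btrfs dev stats` output and return a non-zero exit status, so a scheduled job can raise an alert as soon as a device starts dropping or throwing errors.

```shell
# Hypothetical check: prints every non-zero btrfs error counter and
# exits with status 1 if at least one was found, 0 otherwise.
# Reads `btrfs dev stats` output on stdin.
check_btrfs_stats() {
  awk '
    BEGIN { bad = 0 }
    /^\[/ && NF >= 2 && $2 != 0 { print "ERROR:", $1, $2; bad = 1 }
    END { exit bad }
  '
}
```

A scheduled script could then run something like `btrfs dev stats /mnt/cache | check_btrfs_stats || echo "cache pool has errors"`, replacing the `echo` with whatever notification method you use.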

  • Author

Thanks JorgeB for the prompt reply. Sure enough, today the SSD has turned read-only; at least, I'm not able to start any of my VMs anymore. I managed to copy files out of the VMs two days ago (since some of the vdisks couldn't be copied out in their entirety), and the appdata backup was already there thanks to CA Backup (thanks Squid for this!!).

 

Planning to get a pair of Crucial P5 Plus, since the 5-year warranty on my two Evo Plus drives ended two months ago in June....lol

SN700 seems fun but way too expensive in my country for some reason.....

  • Author


So I've gotten the replacement SSD and got the disk replaced without doing any reassigning etc., since the old drives have nothing useful on them anymore. Everything looks good and the Docker service is back up, but one thing is worrying me a bit. I have set up scrub and balance to run monthly now, just in case it helps in the future (I will set up the script suggested by JorgeB soon), but when I run "perform full balance" the page refreshes almost immediately (maybe due to nothing in the disks) and the recommendation doesn't go away.

image.thumb.png.c2301f5c05b4286593fa3925b78dde2a.png

I then tried running the CLI command below and got the output shown, but the GUI still shows the same message. Should I just ignore this?

image.png.870aef466dadfeab3bd2450fea46ece2.png

 

 

 

  • Community Expert
4 hours ago, ars92 said:

maybe due to nothing in the disks

Correct, it's normal.
