btrfs pool read/write errors (cache)



Recommended next steps?
The sde1 device in my server's cache pool has write, read, and corruption errors. The hard drive is the secondary device in a cache pool with an SSD. I've followed the instructions in the FAQ for monitoring a btrfs pool. Here are the steps I've taken:

 

  • Check btrfs stats
    • confirmed errors
  • Check cable connections (same cables)
  • Run btrfs scrub (clicking on cache from main)
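
For anyone following along from the command line, here is a minimal sketch of the same checks. It assumes the pool is mounted at /mnt/cache (an assumption, adjust to your system), and the helper name `nonzero_btrfs_errors` is made up for illustration. It only filters the output of `btrfs device stats` down to the counters that are not zero.

```shell
# Hypothetical helper: reads `btrfs device stats` output on stdin and
# prints only the counters whose value is not zero.
# Each input line looks like: [/dev/sde1].write_io_errs    0
nonzero_btrfs_errors() {
    awk 'NF == 2 && $2 + 0 != 0 { print }'
}

# Usage on a live system (run as root; /mnt/cache is an assumed mount point):
#   btrfs device stats /mnt/cache | nonzero_btrfs_errors
#   btrfs scrub start /mnt/cache     # start a scrub in the background
#   btrfs scrub status /mnt/cache    # check progress and error counts
```

If the helper prints nothing, all counters are zero. Any line it does print names the device and the kind of error (write, read, flush, corruption, or generation).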

 

While the scrub is running I see that there are "uncorrectable errors" on sde1. The drive in question also fails a SMART test. What would you recommend as next steps? Is it time to replace the drive and cable?

 

Background: These errors started happening after replacing a smaller SSD in the cache pool. sde1 was not touched.

 

[Two screenshots attached: Screen Shot 2020-08-20 at 9.22.10 AM, Screen Shot 2020-08-20 at 9.21.43 AM]

 

 

bluebox-smart-20200820-0925.zip bluebox-diagnostics-20200820-0924.zip

Link to comment
1 hour ago, kingJahfy said:

The drive in question also fails a SMART test. What would you recommend as next steps? is it time to replace the drive

Any drive that fails SMART test should be replaced.
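
A rough way to check this from a shell is to look at the exit status of `smartctl`, which is a bit mask. The sketch below decodes only two of the bits: bit 3 (value 8, overall health reports the disk as failing) and bit 7 (value 128, the self-test log contains errors). The function name `smart_findings` is made up for illustration, and /dev/sde is taken from the post above.

```shell
# Hypothetical helper: decode a smartctl exit status (a bit mask)
# into the two findings that matter most for a replace decision.
smart_findings() {
    if [ $(( $1 & 8 )) -ne 0 ]; then
        echo "SMART overall health: DISK FAILING"
    fi
    if [ $(( $1 & 128 )) -ne 0 ]; then
        echo "self-test log contains errors"
    fi
    return 0
}

# Usage on a live system (run as root; device name from the post):
#   smartctl -H -l selftest /dev/sde
#   smart_findings $?
```

Other bits in the exit status report things like failed commands or error log entries; see the smartctl man page for the full list.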

 

1 hour ago, kingJahfy said:

The HD drive is the secondary drive in a cache pool with an SSD.

Why would you do this? Why do you think you need 2TB for cache anyway?

 

There are also some things about how you have dockers / VMs configured and using your shares that I would change. Why do you have a 40G docker.img? 20G should be more than enough unless you have something misconfigured.

 

Personally I would go for an SSD-only cache and not even have another device in the pool unless it was another SSD. Since you apparently don't care about redundancy in cache, you might as well just go with a single SSD.

 

Since cache appears to be readable and there is not a lot of used space currently, you might just move it all to the array, create a new SSD-only cache with new SSD(s), and then work on getting those things that belong on cache moved back. Currently your domain and system shares have files on the array anyway, so that could use some work, as well as recreating docker.img at a more reasonable size.

 

Link to comment

Current pool is a mess with multiple profiles:

 

Overall:
    Device size:           1.82TiB
    Device allocated:         388.28GiB
    Device unallocated:           1.44TiB
    Device missing:             0.00B
    Used:             365.34GiB
    Free (estimated):         746.54GiB    (min: 744.60GiB)
    Data ratio:                  1.99
    Metadata ratio:              1.80
    Global reserve:         170.03MiB    (used: 0.00B)

             Data    Data      Data    Metadata  Metadata  Metadata System   System   System              
Id Path      single  RAID1     DUP     single    RAID1     DUP      single   RAID1    DUP      Unallocated
-- --------- ------- --------- ------- --------- --------- -------- -------- -------- -------- -----------
 1 /dev/sdd1 1.00GiB 188.00GiB 2.00GiB   1.00GiB   3.00GiB  2.00GiB 32.00MiB 96.00MiB 64.00MiB   734.33GiB
 2 /dev/sde1       - 188.00GiB       -         -   3.00GiB        -        - 96.00MiB        -   740.42GiB
-- --------- ------- --------- ------- --------- --------- -------- -------- -------- -------- -----------
   Total     1.00GiB 188.00GiB 1.00GiB   1.00GiB   3.00GiB  1.00GiB 32.00MiB 96.00MiB 32.00MiB     1.44TiB
   Used      1.50MiB 182.29GiB   0.00B 512.00KiB 387.03MiB    0.00B    0.00B 48.00KiB    0.00B 

      

 

You should back up and recreate the pool with the single SSD.
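
The table above looks like the output of `btrfs filesystem usage -T`. A quicker way to spot mixed profiles is `btrfs filesystem df`, which prints one line per block group type and profile. The sketch below is only an illustration: the function name `mixed_profiles` is made up, and /mnt/cache is an assumed mount point. It lists every block group type that appears with more than one profile.

```shell
# Hypothetical helper: reads `btrfs filesystem df` output on stdin and
# prints each block group type that has more than one profile.
# Each input line looks like: Data, RAID1: total=188.00GiB, used=182.29GiB
mixed_profiles() {
    awk -F'[,:] *' '
        NF >= 2 && $1 != "GlobalReserve" {
            # Count each (type, profile) pair only once.
            if (!seen[$1 SUBSEP $2]++) {
                count[$1]++
                profiles[$1] = profiles[$1] " " $2
            }
        }
        END {
            for (t in count)
                if (count[t] > 1)
                    print t ":" profiles[t]
        }
    ' | sort
}

# Usage on a live system (run as root; /mnt/cache is an assumed mount point):
#   btrfs filesystem df /mnt/cache | mixed_profiles
```

On a healthy pool each type has a single profile and the helper prints nothing.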

Link to comment
6 hours ago, trurl said:

Why would you do this? Why do you think you need 2TB for cache anyway?

I had an extra 1TB HDD lying around that matched the size of the 1TB SSD that I recently purchased, and thought it would be better to make a pool for redundancy than to use a single drive.

As indicated at https://wiki.unraid.net/UnRAID_Manual_6#Creating_a_Cache_Pool - "Pooling multiple storage devices together ensures that data protection is maintained at all times, whether data is in the cache or the array"

I bought a 1TB SSD because the previous 250GB SSD was regularly over 75% usage and there was a discount.

Link to comment

Pairing an HDD with an SSD sort of defeats the purpose of having the SSD, since the pool will only work at HDD speed. And if redundancy is what you had in mind, you apparently didn't accomplish it, as shown by johnnie.black. I assumed you were just combining capacity, with no redundancy, since that is what your screenshot showed.

 

You also need to consider getting those ReiserFS disks converted.

 

 
