thatnovaguy

Primary cache drive failing swap question


I recently bought a 1 TB Team Group SSD to pair with the 1 TB Mushkin SSD I had in my desktop, as an upgrade from my old 240 GB cache SSDs. I moved my cache data over and set the new drive as the primary and my old one as the backup. I noticed that the temperature read 0 on the new SSD but didn't think anything of it. However, now I've gotten TRIM errors and my second SSD keeps getting hot (hopefully unrelated, as I swapped cases and the SSDs are behind the mobo with no airflow).

 

Anywho, what I'd like to know is: 1. Am I correct in assuming the new SSD is going bunk? 2. Can I pull the new SSD, assign the old one to the primary cache slot, and continue with just the one while I RMA the new one? And for extra credit, what would be a reliable replacement? I'm looking to stay at ~$100 or less.

 

1. Given it's a budget SSD brand, it could be many things.

  • Disk does not support SMART
  • Disk does not support TRIM
  • Disk is actually failing
  • etc.

Anyway, go to Tools -> Diagnostics and attach the resulting file to your forum post.

 

2. It depends on what you meant by "set the new drive as the primary and my old as the backup".

If it's BTRFS RAID1 then you should be fine (other than Unraid screaming errors at you while you wait for the RMA)

 

3. I would recommend NOT using a budget-brand SSD.

 

4. With regard to the other SSD, how hot is hot? How frequently does it get hot?
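On the first point above (whether the disk supports TRIM at all), here is a minimal sketch of one way to check from the command line. It assumes `lsblk` from util-linux is available on the server; the helper name `trim_support` is made up for illustration, and `sdg`/`sdi` are simply the device names that appear later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of "NAME DISC-MAX" (in bytes) on stdin and
# reports whether each device advertises discard (TRIM) support.
# A DISC-MAX of 0 means the kernel sees no discard capability for the device.
trim_support() {
    awk 'NF >= 2 {
        if ($2 + 0 > 0) print $1 ": TRIM supported"
        else            print $1 ": no TRIM support reported"
    }'
}

# Usage on a live system (device names are examples, adjust to your own):
#   lsblk --nodeps --noheadings --bytes --output NAME,DISC-MAX /dev/sdg /dev/sdi | trim_support
```

This only shows what the kernel reports for the device. It does not tell a failing disk apart from one that never supported TRIM, so the diagnostics are still the better source.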

It's in BTRFS RAID1, so that's a plus. According to the notification, the old drive is hitting 46 °C about twice a day.

2 minutes ago, thatnovaguy said:

According to the notification the old drive is hitting 46° C about twice a day.

That's nothing to worry about.

OK, I had no idea what a decent threshold would be.

46 minutes ago, testdasi said:

If it's BTRFS RAID1 then you should be fine (other than Unraid screaming errors at you while you wait for the RMA)

I missed the memo. Does Unraid properly notify when the cache pool is degraded? Last I remember, it didn't.

 

Which leads to a followup question. Are you positive the RAID1 is operating properly?

1 minute ago, jonathanm said:

I missed the memo. Does unraid properly notify when the cache pool is degraded? Last I remember, it didn't.

 

Which leads to a followup question. Are you positive the RAID1 is operating properly?

When I last pulled my SSD from my (old) BTRFS RAID1 cache, I had repeated errors in syslog (can't remember the details, it was only for a few hours so I thought "meh, don't care" and the errors went away with the pulled SSD replaced). Maybe my RAID1 didn't work properly and I didn't realise. 😅

2 minutes ago, testdasi said:

When I last pulled my SSD from my (old) BTRFS RAID1 cache, I had repeated errors in syslog (can't remember the details, it was only for a few hours so I thought "meh, don't care" and the errors went away with the pulled SSD replaced). Maybe my RAID1 didn't work properly and I didn't realise. 😅

Ahh, syslog errors. Yes, I would definitely expect a boatload of syslog errors.

 

What I was referring to is the standard array notification system, with browser, email, and whatever other notifications are enabled.

 

I have seen several instances on the forum of RAID1 BTRFS issues where the second member wasn't actively participating; it only superficially appeared to be working properly.

3 minutes ago, jonathanm said:

What I was referring to is the standard array notification system, with browser, email, and whatever other notifications are enabled.

It depends on how it fails. If there's a missing device you'll get a notification, but if a device only drops offline temporarily, most times you won't. That's why it's important to monitor any btrfs pool for errors.
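As a rough sketch of what that monitoring could look like: `btrfs device stats` prints per-device error counters, and any non-zero value is worth a look. The two helper functions below are hypothetical names, not part of Unraid or btrfs-progs; they only filter the command's output.

```shell
#!/bin/sh
# Hypothetical helpers that read `btrfs device stats` output on stdin.
# Each input line looks like: [/dev/sdg1].write_io_errs    0

# Sums every counter; 0 means no errors have been recorded for the pool.
btrfs_error_total() {
    awk '{ total += $NF } END { print total + 0 }'
}

# Prints only the counters that are non-zero, so the affected device is visible.
btrfs_nonzero_counters() {
    awk '$NF + 0 != 0'
}

# Usage on a live system (mount point as used elsewhere in this thread):
#   btrfs device stats /mnt/cache | btrfs_nonzero_counters
```

Run on a schedule, a non-empty result from the second helper would flag a pool member that dropped offline and came back, which is the case the notifications tend to miss.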


I don't see any cache-related errors, but the pool is using the wrong profile: only data is RAID1, while metadata is single, i.e., if you lose one device the pool is hosed. Convert with:

btrfs balance start -mconvert=raid1 /mnt/cache

When done post output of:

btrfs fi usage -T /mnt/cache
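For anyone wanting to spot the same profile mismatch on their own pool before or after the balance, here is a small sketch. It parses the output of `btrfs filesystem df`, and the function name `nonraid1_profiles` is an assumption made for this example only.

```shell
#!/bin/sh
# Hypothetical helper: reads `btrfs filesystem df` output on stdin and lists
# every block group type whose profile is not RAID1.
# Input lines look like: Metadata, single: total=3.00GiB, used=555.23MiB
nonraid1_profiles() {
    awk -F'[,:]' '
        # GlobalReserve is always reported as single, so it is skipped.
        $1 == "GlobalReserve" { next }
        NF >= 2 {
            profile = $2
            gsub(/^ +| +$/, "", profile)   # trim surrounding spaces
            if (toupper(profile) != "RAID1") print $1 " is " profile
        }'
}

# Usage on a live system:
#   btrfs filesystem df /mnt/cache | nonraid1_profiles
```

No output means data, metadata, and system are all RAID1. A line such as `Metadata is single` is the situation described above.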

 

Overall:
    Device size:                   1.82TiB
    Device allocated:            306.06GiB
    Device unallocated:            1.52TiB
    Device missing:                  0.00B
    Used:                        251.23GiB
    Free (estimated):            803.41GiB      (min: 803.41GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              166.30MiB      (used: 0.00B)

             Data      Metadata  System              
Id Path      RAID1     RAID1     RAID1    Unallocated
-- --------- --------- --------- -------- -----------
 2 /dev/sdg1 150.00GiB   3.00GiB 32.00MiB   778.48GiB
 1 /dev/sdi1 150.00GiB   3.00GiB 32.00MiB   778.48GiB
-- --------- --------- --------- -------- -----------
   Total     150.00GiB   3.00GiB 32.00MiB     1.52TiB
   Used      125.07GiB 555.23MiB 48.00KiB 
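As a sanity check on the output above, the "Free (estimated)" figure appears to be consistent with the other numbers: the raw unallocated space divided by the data ratio, plus the space still free inside the already allocated data chunks. The sketch below only redoes that arithmetic; the function name is made up, and the formula is an assumption that happens to reproduce the reported value, not a statement of how btrfs-progs computes it internally.

```shell
#!/bin/sh
# Hypothetical helper: estimates free space in GiB.
#   $1 = total raw unallocated space across devices (2 x 778.48 GiB here)
#   $2 = data space allocated (Total row, Data column)
#   $3 = data space used (Used row, Data column)
#   $4 = data ratio (2.00 for RAID1)
estimate_free_gib() {
    awk -v unalloc="$1" -v alloc="$2" -v used="$3" -v ratio="$4" \
        'BEGIN { printf "%.2f\n", unalloc / ratio + (alloc - used) }'
}

estimate_free_gib 1556.96 150.00 125.07 2.00   # prints 803.41
```

With both Data ratio and Metadata ratio at 2.00 and all three columns showing RAID1, the pool now keeps two copies of everything, so it should survive losing either device.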

 

