SATA cable loose. Restarted, SSD now has blue dot. Help.


ZipsServer

Hiya,

 

Quick question so I do not lose my data.

 

I have a three-SSD cache pool, and over the past 3 days I started getting severe BTRFS errors. I was having trouble with one of my apps that heavily uses the cache pool, so I restarted the system. To my horror, one of the SSDs was then missing while the array was running (the Docker page had lots of errors). Thinking that one of the SATA cables had come loose (or gone bad?), I changed out the cables and restarted the system. Now the SSD has reappeared, but with a blue dot, meaning that Unraid thinks it is a new drive.

 

Should I use the "New Config" tool to get my array back where it was? What is the safest way to handle this? It's been a while since I worked on my system...

 

Thanks!

 

EDIT:

Trying to mount the SSD with Unassigned Devices gives this error:

Apr 18 00:47:25 MasterTower unassigned.devices: Mount of '/dev/sdc1' failed. Error message: mount: /mnt/disks/SanDisk_SDSSDA240G_154836404031: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.

Could it have become corrupted by the bad cable? Is it safest to re-add the cache and let Unraid rebuild the drive?

 

mastertower-diagnostics-20180418-0011.zip

Your cache is corrupt. Besides the dropped device, there are a lot of errors on another one:

 

Apr 17 23:59:56 MasterTower kernel: BTRFS info (device sdb1): bdev /dev/sdk1 errs: wr 15078549, rd 13910564, flush 60124, corrupt 0, gen 0

I recommend you back up your current pool, check the connections/cables on all SSDs, and recreate the pool. Also upgrade to v6.5, since there are several cache pool improvements starting with v6.4.1.
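
For anyone reading those counters: as far as I know, wr, rd and flush are I/O error counts, corrupt counts checksum mismatches, and gen counts generation errors, the same five values that `btrfs device stats` reports. Below is a minimal Python sketch that pulls the numbers out of a syslog line like the one quoted above. The function name `parse_btrfs_errs` and the regex are only an illustration (they are not part of any Unraid or btrfs tool), and the pattern assumes the exact line format shown in this thread.

```python
import re

# Assumed format: the per-device counter line as it appears in this thread's syslog.
BTRFS_ERRS = re.compile(
    r"BTRFS info \(device (?P<fs_dev>\S+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_btrfs_errs(line):
    """Return the device names and integer counters, or None if the line does not match."""
    match = BTRFS_ERRS.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    result = {"fs_dev": fields["fs_dev"], "bdev": fields["bdev"]}
    for name in COUNTERS:
        result[name] = int(fields[name])
    return result


line = (
    "Apr 17 23:59:56 MasterTower kernel: BTRFS info (device sdb1): "
    "bdev /dev/sdk1 errs: wr 15078549, rd 13910564, flush 60124, corrupt 0, gen 0"
)
stats = parse_btrfs_errs(line)
print(stats)
```

Large wr/rd/flush values with corrupt at 0, as here, look more like a device that kept dropping off the bus than like silent data corruption, which fits the advice to check cables.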

15 hours ago, johnnie.black said:

Thank you! Great post.

 

So before the entire cache pool went down, I was able to back up SSD 2 (I think).

 

Now, with your instructions, I was able to mount SSD 1 and SSD 3 and use rsync -a to copy the files to a cache pool backup on my array. I am also attempting to restore SSD 2 (because it wouldn't mount and I am not entirely confident in the previous backup).

 

My question is how does "btrfs restore" work and what file attributes does it save/copy? Can I use rsync -a to compare the /restore directory with my main cache pool backup? I would rather not lose the file attributes if I don't have to... (waiting for the restore to finish so I can do a dry run with rsync -avn)
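
On the attributes question: as far as I know, `btrfs restore` only restores owner, mode, and timestamps when run with -m (--metadata), and extended attributes need -x, so it is worth checking the man page for your version before relying on them. To see roughly what an rsync -avn dry run would flag, here is a minimal Python sketch that walks one tree and reports files that are missing from the other or whose size, modification time, permissions, or owner differ. The function name `compare_trees` is hypothetical, it copies nothing, and unlike rsync it does not compare file contents.

```python
import os
import stat


def _signature(path):
    """Size, whole-second mtime, permission bits, and owner of a path (symlinks not followed)."""
    info = os.lstat(path)
    return (
        info.st_size,
        int(info.st_mtime),
        stat.S_IMODE(info.st_mode),
        info.st_uid,
        info.st_gid,
    )


def compare_trees(src, dst):
    """Return (missing, different): relative paths under src that are absent
    from dst, or present but with a different metadata signature."""
    missing, different = [], []
    for root, _dirs, files in os.walk(src):
        for name in files:
            src_path = os.path.join(root, name)
            rel = os.path.relpath(src_path, src)
            dst_path = os.path.join(dst, rel)
            if not os.path.lexists(dst_path):
                missing.append(rel)
            elif _signature(src_path) != _signature(dst_path):
                different.append(rel)
    return sorted(missing), sorted(different)
```

Usage would be something like `missing, different = compare_trees(restore_dir, backup_dir)`, where both arguments are placeholders for your own restore and backup paths. Files that show up in `different` only because of mode or owner are the ones where the restore lost attributes.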

 

EDIT:

I finished backing up SSD 2 and I had yet to reconcile the differences (if any) between my main cache backup and the restore directory for SSD 2. However, when I went to reassign the cache drives in preparation to create a new cache pool, I started the array and everything was back up and running "normally". Data seems to be there, Dockers were up and running, etc. I then tried to use rsync to copy from the cache pool to my cache pool backup, but it gave a bunch of I/O errors.

 

I am still noticing BTRFS errors on SSD 2 (not surprised). So I assume it is still best to wipe and reformat the cache pool?

 

EDIT 2:

Yep, so I cleared all of the SSDs, reformatted them a few times, and then copied the backup into the fresh cache pool. Everything seems to be OK, but I haven't fully tested all of my apps yet. Hopefully there was no data loss.
Thanks!
