B1scu1T

Cache Drives Unmountable after v6.3 Upgrade


Hi All,

 

Hopefully someone can help me out here. Here is a summary of the timeline:

As far as I know, everything was fine this morning. I used some of my docker apps so I have no reason to presume otherwise.

My new 4TB HDD arrived, so I plugged it in and then opened up the webGUI

Saw that there were a load of plugin updates to do, so I ran them (including the OS update from 6.2.x (sorry, I don't remember exactly) to 6.3.2) and rebooted.

After the reboot, I set off the preclear on my new disk.

Switched back to the "Main" page and noticed my cache drive 1 is now unmountable.

Switched back to the preclear and stopped it, so that I could analyse the cache problem first.

Checked the "appdata" via SMB share and all my docker files were missing.

Switched the two cache drives in the GUI (1 becomes 2, and 2 becomes 1) and now the other physical cache drive is unmountable.

Tried starting the array with just one cache drive; each time the same result... they are unmountable.

 

As far as I'm aware, they were always set up as btrfs, but to be honest it's not something I have paid huge attention to... I've never had to!

Essentially, I didn't change any settings that should affect this part of the system, so I can't think of any reason this would happen.

 

I have attached the diags.

 

tower-diagnostics-20170329-1354.zip


Something went wrong with the pool, it's reporting a negative number on devices missing, don't think I ever saw that before:

 

Mar 29 13:52:09 Tower emhttp: cacheNumDevices: 2
Mar 29 13:52:09 Tower emhttp: cacheTotDevices: 1
Mar 29 13:52:09 Tower emhttp: cacheNumMissing: -1
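For anyone wanting to check the same counters on their own box, here is a minimal sketch that pulls these emhttp lines out of a syslog. The function name and the default path /var/log/syslog are assumptions for illustration; point it at the syslog inside your diagnostics zip if that is what you have.

```shell
# Print the emhttp cache pool counter lines from a syslog file.
# $1: path to the syslog (assumed default: /var/log/syslog).
cache_pool_counters() {
    grep -E 'emhttp: cache(NumDevices|TotDevices|NumMissing):' "${1:-/var/log/syslog}"
}

# Usage:
#   cache_pool_counters /var/log/syslog
```

It only reads the log, so it is safe to run at any time.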

You can try disconnecting one SSD at a time (it needs to be physically disconnected from the server) and see if the pool mounts with the other one alone. If not, use these steps to try and recover your data:

 

 


 

 

31 minutes ago, johnnie.black said:

Something went wrong with the pool, it's reporting a negative number on devices missing, don't think I ever saw that before:

 


Mar 29 13:52:09 Tower emhttp: cacheNumDevices: 2
Mar 29 13:52:09 Tower emhttp: cacheTotDevices: 1
Mar 29 13:52:09 Tower emhttp: cacheNumMissing: -1

 

Eek! That's really not the kind of thing I wanted to hear! :(

 

31 minutes ago, johnnie.black said:

 

You can try disconnecting one SSD at a time (it needs to be physically disconnected from the server) and see if the pool mounts with the other one alone. If not, use these steps to try and recover your data:

 

New timeline:

(~1502.zip) Shut down, unplugged the Sandisk SSD and booted up - files are missing but the drive does mount

(~1511.zip) Shut down and swapped them round so the Sandisk is in and the Kingston is out, but I also unplugged my new Toshiba HDD - all files are back as expected

(~1526.zip) Plugged in the Toshiba drive while running and rebooted the server - status the same as previous

 

I have left it like this for now as everything seems to be running, but it would be good to get the Kingston back into the cache pool safely. Should I just follow the standard instructions, or is there anything that I should check first?

tower-diagnostics-20170329-1502.zip

tower-diagnostics-20170329-1511.zip

tower-diagnostics-20170329-1526.zip


With the array stopped wipe out the Kingston:

 

blkdiscard /dev/sdX

 

Replace X with the correct letter.
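Since blkdiscard wipes whatever device it is given, it is worth double-checking the letter first. A minimal sketch, assuming lsblk is available on the server; the helper name dev_for_model is made up for illustration and it only reads text, it does not touch any drive:

```shell
# Map a drive model substring to its /dev node.
# Reads `lsblk -dn -o NAME,MODEL` style lines on stdin: "<name> <model...>".
# $1: case-insensitive model substring, e.g. "kingston".
dev_for_model() {
    pat="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    awk -v pat="$pat" '{
        model = $0
        sub(/^[^ ]+ +/, "", model)        # drop the NAME column, keep the model text
        if (index(tolower(model), pat)) print "/dev/" $1
    }'
}

# Usage (read-only, wipes nothing):
#   lsblk -dn -o NAME,MODEL | dev_for_model kingston
```

If it prints more than one device, or none, stop and identify the drive by serial number instead before running blkdiscard.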

 

Re-add the Kingston to the pool and start the array. A balance will begin; don't stop the array until it finishes, i.e., when there's no more cache pool read/write activity.
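Rather than judging by read/write activity alone, the balance can also be queried from the command line. A rough sketch, assuming the pool is mounted at /mnt/cache and that `btrfs balance status` prints a line containing "is running" while the balance is in progress; the helper name is hypothetical:

```shell
# Succeeds (exit 0) if the text on stdin reports a balance in progress.
# Assumes the "is running" wording printed by `btrfs balance status`.
balance_running() {
    grep -q 'is running'
}

# Usage (assumes the cache pool is mounted at /mnt/cache):
#   btrfs balance status /mnt/cache | balance_running && echo "still balancing"
#   btrfs filesystem show /mnt/cache    # should list both pool devices
```

Once the status no longer reports a running balance, it should be safe to stop the array.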

Edited by johnnie.black

9 minutes ago, johnnie.black said:

With the array stopped wipe out the Kingston:

 


blkdiscard /dev/sdX

 

Replace X with the correct letter.

 

Re-add the Kingston to the pool and start the array. A balance will begin; don't stop the array until it finishes, i.e., when there's no more cache pool read/write activity.

Apologies for my ignorance, but is the command to be entered directly from the console, or is this something I can do via web access? I can't see a terminal app built in as standard :$


OK, there seems to be I/O activity on the cache, so it looks like things are working.

 

Is it safe to also run a pre-clear at the same time or is that asking for trouble?
