Cache Drive Unformatted After Reboot


Hi all, I've been using unRAID for a few years now with great success. I am currently on unRAID 6.2.4. Last night I was trying to download a video and noticed my Dockers were all stopped. I thought that was suspicious, so I did a reboot and carried on with my evening. This afternoon my wife said she couldn't get Kodi to work on our Raspberry Pi, so I checked, and the cache drive was listed as Unassigned and the only option I had was a reformat. Unfortunately I rebooted again before I realized I had a real problem. So I got the diagnostics. Now when I look at the Main screen it shows the disk as size 0 (zero), and the button says "Insert" and is greyed out. After yet another reboot, the size and the format option are back.

 

I tried the cache recovery options but anything I try to do with /dev/sdg1 says it doesn't exist. 

 

The only other thing I could find on Google was a post from 2009 describing the same problem, caused by the cache disk getting full. This is definitely possible, as I pretty much just let this thing run unattended and something could have gotten out of hand.

 

Most everything important on the cache drive was backed up, but I would rather do a recovery if possible, as I don't remember how to set everything back up and would have to spend a few nights going through that again. Like I said, it's been a couple of years. Any help you can provide would be appreciated.

tower-diagnostics-20170712-1952.zip

Link to comment
15 hours ago, johnnie.black said:

Cache disk is full of pending sectors and needs to be replaced.

OK, thanks for the response. Can you tell me what that means and how you determined that?

 

Is there any way to copy the data off? I have a spare waiting to go for just such an eventuality, but I really don't want to have to redo all of my Dockers.

Edited by leodavinci
Link to comment
14 minutes ago, leodavinci said:

Can you tell me what that means and how you determined that?

 

By looking at the SMART report. I assume you don't have notifications enabled, or you'd have gotten warnings.

 

197 Current_Pending_Sector  0x0012   193   193   000    Old_age   Always       -       620
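
For anyone reading a SMART report for the first time, here is a minimal Python sketch of how a row like the one above could be pulled apart. It assumes the usual `smartctl -A` column order (ID, name, flag, value, worst, threshold, type, updated, when failed, raw value); the names `SmartAttribute`, `parse_attribute_line` and `raw_count` are made up for this illustration and are not part of unRAID or smartmontools.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int      # normalized value reported by the drive
    worst: int
    threshold: int
    raw: str        # raw value column, kept as text


def parse_attribute_line(line: str) -> SmartAttribute:
    # Assumed column order: ID, name, flag, value, worst, thresh,
    # type, updated, when_failed, raw.
    fields = line.strip().split(None, 9)
    if len(fields) < 10:
        raise ValueError(f"not a SMART attribute row: {line!r}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        threshold=int(fields[5]),
        raw=fields[9],
    )


def raw_count(attr: SmartAttribute) -> int:
    # Some drives append extra text to the raw column; keep the leading integer.
    return int(attr.raw.split()[0])


line = "197 Current_Pending_Sector  0x0012   193   193   000    Old_age   Always       -       620"
attr = parse_attribute_line(line)
print(attr.name, "raw =", raw_count(attr),
      "normalized =", attr.value, "threshold =", attr.threshold)
```

The number that matters here is the raw value in the last column: 620 sectors the drive could not read and has marked as pending. The normalized value (193) is still above the threshold (000), so a check that only compares value against threshold would not flag this drive.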

 

17 minutes ago, leodavinci said:

Is there any way to copy the data off?

 

I'd say the chances are very low given the high number of pending sectors. You'd need to clone that disk with dd, skipping the bad sectors, and then run reiserfsck on the clone. Even if that works, there will probably be some (or a lot of) corrupt files.
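
To illustrate what "a clone skipping the bad sectors" means, here is a rough Python sketch of the idea: copy the source block by block and write zeros wherever a read fails, so that everything after the bad spot stays at the same offset (this is the behaviour dd gives with `conv=noerror,sync`). It is only a sketch of the technique, not a replacement for dd or a dedicated tool such as GNU ddrescue. The function name `clone_skipping_errors` and the `read_block` parameter are invented for this example, and `os.pread` is Unix-only.

```python
import os


def clone_skipping_errors(src_path, dst_path, block_size=4096, read_block=os.pread):
    """Copy src to dst block by block, zero-filling blocks that cannot be read.

    Returns the list of byte offsets where a read failed.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        size = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    data = read_block(src, want, offset)
                except OSError:
                    # Unreadable block: remember it and carry on.
                    data = b""
                    bad_offsets.append(offset)
                # Pad failed or short reads with zeros so offsets stay aligned.
                dst.write(data.ljust(want, b"\0"))
                offset += want
    finally:
        os.close(src)
    return bad_offsets
```

A hypothetical call would look like `clone_skipping_errors("/dev/sdX", "/mnt/disk1/cache.img")`, with the device and image path replaced by your own; the file system check then runs against the copy rather than the failing disk. Every zero-filled block is lost data, which is why corrupt files are likely with this many pending sectors.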

Link to comment

I do have notifications enabled for once a week, and the last one was July 10. Below is the text. Is this something that would happen all of a sudden? Is there something that would have caused it? If a drive can be OK on one notification and failing that quickly afterwards, should I increase my notification frequency?

 

Event: unRAID Status
Subject: Notice [TOWER] - array health report [PASS]
Description: Array has 10 disks (including parity & cache)
Importance: normal

Parity - ST4000DM000-1F2168_S300HCGM (sdj) - standby [OK]
Disk 1 - WDC_WD20EZRX-19D8PB0_WD-WMC4M1000956 (sdd) - standby [OK]
Disk 2 - WDC_WD20EZRX-19D8PB0_WD-WCC4M0355112 (sde) - standby [OK]
Disk 3 - WDC_WD20EARS-00S8B1_WD-WCAVY3778723 (sdf) - active 30 C [OK]
Disk 4 - Hitachi_HDS722020ALA330_JK1101B9H9UY9T (sdc) - standby [OK]
Disk 5 - ST4000DM000-1F2168_Z30266QR (sdk) - active 25 C [OK]
Disk 6 - ST4000DM000-1F2168_Z3035YBR (sdi) - standby [OK]
Disk 7 - ST4000DM000-1F2168_S300NL63 (sdm) - standby [OK]
Disk 8 - ST2000DM001-9YN164_W1E1J0V1 (sdg) - standby [OK]
Cache - WDC_WD5000AAKS-00TMA0_WD-WCAPW0936270 (sdh) - active 31 C [OK]

Parity is valid
Last checked on Mon 03 Jul 2017 03:27:28 PM EDT (7 days ago), finding 0 errors.
Duration: 13 hours, 27 minutes, 27 seconds. Average speed: 82.6 MB/s

 

Link to comment

That is an array status notification. To get SMART warnings you need to enable warnings and alerts notifications and make sure SMART attribute notifications are also enabled. With these you'll get a notification the instant there is an attribute change, and that can happen at any time without any previous warning.
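
The difference between a periodic health report and an attribute-change warning can be sketched in a few lines of Python. This is only an illustration of the idea, not how unRAID implements it: the `WATCHED` table, the `changed_attributes` helper and the "last week" numbers are all assumptions made up for the example. Only the value 620 for attribute 197 comes from this thread.

```python
from typing import Dict, List, Tuple

# Attribute IDs to watch; an illustrative choice, not unRAID's internal list.
WATCHED = {
    5: "Reallocated sectors count",
    187: "Reported uncorrectable errors",
    197: "Current pending sector count",
    198: "Uncorrectable sector count",
}


def changed_attributes(
    previous: Dict[int, int], current: Dict[int, int]
) -> List[Tuple[int, str, int, int]]:
    """Return (id, name, old_raw, new_raw) for watched attributes whose raw value rose."""
    changes = []
    for attr_id in sorted(WATCHED):
        old = previous.get(attr_id, 0)
        new = current.get(attr_id, 0)
        if new > old:
            changes.append((attr_id, WATCHED[attr_id], old, new))
    return changes


# Hypothetical snapshots of raw values for one drive.
last_week = {5: 0, 187: 0, 197: 0, 198: 0}
today = {5: 0, 187: 0, 197: 620, 198: 0}

for attr_id, name, old, new in changed_attributes(last_week, today):
    print(f"Warning: {attr_id} {name} went from {old} to {new}")
```

A check like this fires as soon as a watched raw value changes, whereas a weekly pass/fail summary can keep reporting OK in between.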

Link to comment

Well, that's irritating. I guess I assumed that since the array health was OK, everything was chugging along fine. So, how do I enable SMART warnings?

 

I have all of the notifications under "Notification Settings" set to Browser and Email, and I am getting the array health emails. Under "Disk Settings->Global SMART Settings" I have the following. What should I change to get these notifications?

 

Default SMART notification value: Raw
Default SMART notification tolerance level: Absolute
Default SMART controller type: Automatic
 
 
checked - 5 Reallocated sectors count
checked - 187 Reported uncorrectable errors
unchecked - 188 Command time-out
checked - 197 Current pending sector count
checked - 198 Uncorrectable sector count
Edited by leodavinci
Link to comment
6 hours ago, leodavinci said:

I have all of the notifications under "Notification Settings" set to Browser and Email, and I am getting the array health emails. Under "Disk Settings->Global SMART Settings" I have the following. What should I change to get these notifications?

 

Looks like they are all enabled, so I'm not sure why you didn't get the pending sector notifications. If you go to the Dashboard page, do you see the SMART warnings?

Link to comment

I am getting pending sector warnings, but I haven't gotten any warnings for that drive. I got warnings for a removable drive I am using as a rotating off-site backup for all the important stuff. It looks like this:

 

Event: unRAID device sdb SMART health [198]
Subject: Warning [TOWER] - offline uncorrectable is 6424
Description: ST4000LM016-1N2170_W801PWBP (sdb)
Importance: warning

That looks bad. I am going to check it out and maybe return it under warranty if I can.

 

I got one for an array drive that says:

 

Event: unRAID Disk 3 SMART health [198]
Subject: Warning [TOWER] - offline uncorrectable is 2
Description: WDC_WD20EARS-00S8B1_WD-WCAVY3778723 (sde)
Importance: warning

I am running an extended SMART test on it.

 

Nothing about that cache drive though.  I guess it just failed out of the blue.  That sucks. 

Link to comment
