Cache SSD drive unmountable / missing FS



Hi guys,

This morning I couldn't reach the Internet, so I started by checking my Pi-hole DNS Docker container.

When I logged in to the Unraid server, the VM and Docker services were not running (the services couldn't start).

The cache drive was not mounted, so I stopped the array and started it again. It still would not mount.

I then read through a lot of logs, only to find that the problem might have been caused by an unclean reboot.

Hours later, after reading various blog posts around the Internet, I pretty much gave up, since nothing really helped and I didn't want to make things worse (although it may be too late for that).

Anyway, it's my SSD cache drive, /dev/sdf, that is giving me issues.

It contains the system app data, as well as the Docker image and libvirt image, along with some other backup data.

Ideally I would like to recover some files, and I have ordered a USB-to-SATA converter in case I need to hook the drive up to my main PC and try to recover files that way.

But I also wanted to reach out here, in case someone can help me with this.

 

fsck result:

root@Tower:~# fsck /dev/sdf
fsck from util-linux 2.36
e2fsck 1.45.6 (20-Mar-2020)
/sbin/e2fsck: Input/output error while trying to open /dev/sdf

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>
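
One thing worth noting about the output above: `fsck /dev/sdf` targets the whole disk and fell back to e2fsck, whereas an Unraid cache device typically holds btrfs or XFS on its first partition (that would be /dev/sdf1 here; this is an assumption, since the thread never states the pool's filesystem). Below is a minimal sketch of picking a read-only check per filesystem type. The helper name `check_command_for` is hypothetical and it only prints the command instead of running it. Given the Input/output error shown above, any check may still fail simply because the drive itself cannot be read.

```shell
#!/bin/sh
# Hypothetical helper: map a filesystem type (for example the output of
# `blkid -o value -s TYPE <partition>`) to a check command that does not
# modify the disk. The command is printed, not executed.
check_command_for() {
    fstype=$1
    part=$2
    case "$fstype" in
        btrfs)          echo "btrfs check --readonly $part" ;;
        xfs)            echo "xfs_repair -n $part" ;;
        ext2|ext3|ext4) echo "e2fsck -n $part" ;;
        *)              echo "unknown filesystem '$fstype' on $part; do not run fsck blindly" ;;
    esac
}
```

For example, `check_command_for "$(blkid -o value -s TYPE /dev/sdf1)" /dev/sdf1` prints a command that can be reviewed before anything is run against the drive.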

 

Smartctl:

root@Tower:~# smartctl -a /dev/sdf
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.10.1-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Phison Driven SSDs
Device Model:     KINGSTON SA400S37960G
Serial Number:    50026B778326A54D
LU WWN Device Id: 5 0026b7 78326a54d
Firmware Version: SBFK61K1
User Capacity:    960,197,124,096 bytes [960 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Mar  2 13:51:56 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 112) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline 
data collection:                (65535) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       -       1460
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       13332
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       15
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       21
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       3363
167 Write_Protect_Mode      0x0000   100   100   000    Old_age   Offline      -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       14
170 Bad_Blk_Ct_Erl/Lat      0x0000   100   100   010    Old_age   Offline      -       0/34
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       86 (Average 23)
181 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       2
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       13
194 Temperature_Celsius     0x0022   032   041   000    Old_age   Always       -       32 (Min/Max 23/41)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0032   100   100   000    Old_age   Always       -       0
231 SSD_Life_Left           0x0000   097   097   000    Old_age   Offline      -       97
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       22000
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       28075
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       6193
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       23
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       86
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       2040736

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       00%     13330         0

Selective Self-tests/Logging not supported

 

Guides I've been looking at;

-

 

tower-diagnostics-20210302-1343.zip

55 minutes ago, JorgeB said:

You might not be able to repair the filesystem on a failing drive. You can try cloning it with ddrescue and then repairing the filesystem, but note that ddrescue is not optimized for flash devices.

Thanks for the input.

It's running now...

I couldn't get it to work with Unassigned Devices, so I had to mount another drive...

Fingers crossed for when it's done in about 12 hours :)

Edited by CODEG33K

So, with the help of ddrescue, I was able to get an ISO of some sort, and AFAIK this ISO contains the entire disk, including the files themselves.

Now I'm struggling to get that ISO file mounted so that I can extract data from it.

I used this command to create the ISO:

ddrescue -d -f -r3 /dev/sdf /mnt/disks/Testdrive/Drive/cache.iso /mnt/disks/Testdrive/Log/ddrescue.log
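
Note that ddrescue writes a raw sector-by-sector copy of /dev/sdf, so cache.iso is a whole-disk image (partition table included) rather than an ISO 9660 filesystem. A plain loop mount of the file looks for a filesystem at byte 0 and will likely fail. Below is a minimal sketch of how the partition inside the image could be mounted read-only instead, assuming the start sector is read from `fdisk -l cache.iso` and the 512-byte logical sector size reported by smartctl above. The helper names are hypothetical, and the second function only prints the mount command.

```shell
#!/bin/sh
# Hypothetical helpers for loop-mounting one partition out of a raw disk image.

# Byte offset of a partition: start sector times sector size (default 512).
partition_offset_bytes() {
    start_sector=$1
    sector_size=${2:-512}
    echo $(( start_sector * sector_size ))
}

# Print (do not run) a read-only loop mount command for that partition.
loop_mount_command() {
    image=$1
    start_sector=$2
    mountpoint=$3
    offset=$(partition_offset_bytes "$start_sector")
    echo "mount -o ro,loop,offset=$offset '$image' '$mountpoint'"
}
```

With a start sector of 2048 (an assumed value; use whatever fdisk reports), `loop_mount_command /mnt/disks/Testdrive/Drive/cache.iso 2048 /mnt/disks/cache` prints a command with `offset=1048576`. If the filesystem inside the image is itself damaged, the mount can still fail, and a repair or file-recovery tool would then have to be pointed at the image.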

 

The ISO sits on an unassigned disk with an XFS filesystem.

I can't seem to get a filesystem type from the ISO file, and I'm a bit lost as to what to do next.

I've pulled the damaged SSD (cache) drive, just to be sure nothing else happens to it.

Where do I go from here?

 

 


I tried that to begin with, only to end up with zero files to look at.
Then I found the correct command to save an ISO... should it maybe have been *.img instead?

But do you know how I can mount an ISO that contains the entire cache disk?

Also, I did receive the USB-to-SATA converter, so as a last resort I could use some rescue software to at least get my docker.img and VM image back. The rest can be gone; I don't mind.

Edited by CODEG33K
24 minutes ago, SimonF said:

Can you mount the ISO with Unassigned Devices?

I can mount the unassigned device, where I have 2 folders:

* Drive: holds the ISO file
* Log: holds the map file from ddrescue

Edit:

I just realized what you were asking, and tried to add the ISO file as a share in Unassigned Devices, only to see this result in the log:

Mar 3 20:59:02 Tower unassigned.devices: Mount of '/mnt/disks/Testdrive/Drive/cache.iso' failed. Error message: mount: /mnt/disks/cache: wrong fs type, bad option, bad superblock on /dev/loop2, missing codepage or helper program, or other error.

 

Edited by CODEG33K
update
Just now, SimonF said:

There is an ISO mount option in UD under shares; you may need to toggle the slider to see it.

Mar 3 20:59:01 Tower unassigned.devices: Mount iso command: mount -ro loop '/mnt/disks/Testdrive/Drive/cache.iso' '/mnt/disks/cache'
Mar 3 20:59:02 Tower unassigned.devices: Mount of '/mnt/disks/Testdrive/Drive/cache.iso' failed. Error message: mount: /mnt/disks/cache: wrong fs type, bad option, bad superblock on /dev/loop2, missing codepage or helper program, or other error.

 

59 minutes ago, JorgeB said:

That suggests the source is heavily damaged, so it probably doesn't matter if you then clone to an ISO.

I think that was because I used the wrong dd commands.
I haven't tried it with the flags that I later used to create the ISO.
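
How much of the source could actually be read is recorded in the ddrescue map file (the ddrescue.log written alongside the image). Below is a minimal sketch that totals the block sizes marked `+` (rescued) against all blocks listed. The function name `mapfile_summary` is hypothetical, and it assumes the standard map file layout: comment lines starting with `#`, one status line, then `pos size status` rows in hexadecimal.

```shell
#!/bin/sh
# Hypothetical helper: summarise a GNU ddrescue map file as bytes rescued
# versus bytes listed. It only reads the file and never touches a device.
mapfile_summary() {
    rescued=0
    total=0
    status_line_seen=0
    while read -r pos size status _; do
        # Skip blank lines and comment lines.
        case "$pos" in
            ''|'#'*) continue ;;
        esac
        # The first non-comment line is the overall status line, not a block.
        if [ "$status_line_seen" -eq 0 ]; then
            status_line_seen=1
            continue
        fi
        # Sizes are hexadecimal (0x...), which shell arithmetic accepts.
        total=$(( total + $size ))
        if [ "$status" = "+" ]; then
            rescued=$(( rescued + $size ))
        fi
    done < "$1"
    echo "rescued=$rescued total=$total"
}
```

Running `mapfile_summary /mnt/disks/Testdrive/Log/ddrescue.log` prints the rescued and total byte counts. A large gap between the two would support the idea that the source is heavily damaged, regardless of what the output file is called.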
