Superblock failure

February 12, 2018

I have a new issue with my Unraid server.

I was away for a while so I powered down my server. On powering it up Disk9 (sdl) was unmountable and the shares weren't available.

I ran a disk check on the missing disk and got the following:

Phase 1 - find and verify superblock...
superblock read failed, offset 562633482240, size 131072, ag 9, rval -1

fatal error -- Input/output error
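For context, the offset in that error is consistent with a read at the very start of allocation group 9, where XFS keeps a backup copy of the superblock. A minimal arithmetic sketch, assuming the common 4096-byte XFS block size (the thread does not state it) and inferring the AG size by dividing the offset by the AG number:

```python
# Values copied from the xfs_repair output above.
offset = 562633482240   # byte offset of the failed read
ag = 9                  # allocation group number reported
read_size = 131072      # bytes requested by the read

# Assumption: 4096-byte filesystem blocks; not stated in the thread.
BLOCK_SIZE = 4096

# If the read targets the first block of AG 9, then offset == ag * ag_size_bytes.
ag_size_bytes, remainder = divmod(offset, ag)
ag_size_blocks, block_remainder = divmod(ag_size_bytes, BLOCK_SIZE)

print(f"inferred AG size: {ag_size_bytes} bytes ({ag_size_blocks} blocks)")
print(f"offset divides evenly by AG number: {remainder == 0}")
print(f"AG size is a whole number of blocks: {block_remainder == 0}")
```

Both checks come out even, which would fit the read of the AG 9 superblock copy failing at the device level (rval -1, Input/output error) rather than a damaged superblock being found.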

Reading around the forum, it seems the suggested fix for this is to create a new config, but I'd like to double-check before I go any further.

Diagnostics attached below.

Many thanks in advance.

tower-diagnostics-20180212-1409.zip

February 12, 2018

Community Expert

There looks to be a problem with the disk: it dropped offline and was disabled, so there's no SMART report. Reboot and post new diags.

February 12, 2018

Author

rebooted and attached.

tower-diagnostics-20180212-1457.zip

February 12, 2018

Community Expert

I was wrong, I only looked at the file size. Disk9 has SMART disabled, so you need to enable it first:

smartctl -s on /dev/sdl

Then grab and post new diags.

Disk4 is failing, which will make the rebuild of disk9 challenging. Do you have notifications enabled?

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       49109
  5 Reallocated_Sector_Ct   0x0033   057   057   140    Pre-fail  Always   FAILING_NOW 1143
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       627
197 Current_Pending_Sector  0x0032   199   192   000    Old_age   Always       -       185
198 Offline_Uncorrectable   0x0030   200   198   000    Old_age   Offline      -       1
200 Multi_Zone_Error_Rate   0x0008   190   001   000    Old_age   Offline      -       2135
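As a rough illustration of how a table like this can be read, here is a minimal Python sketch that parses attribute rows and flags the worrying ones. The function names and the `WATCH` set are hypothetical, and treating any nonzero raw value on the watched attributes as a warning is an assumption, not a smartctl rule:

```python
# Three rows copied from the table above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   057   057   140    Pre-fail  Always   FAILING_NOW 1143
197 Current_Pending_Sector  0x0032   199   192   000    Old_age   Always       -       185
198 Offline_Uncorrectable   0x0030   200   198   000    Old_age   Offline      -       1
"""

# Assumption: attributes where any nonzero raw count deserves a look.
WATCH = {"Current_Pending_Sector", "Offline_Uncorrectable"}

def parse_smart_attributes(text):
    """Parse attribute rows into dicts, skipping headers and short lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        raw = parts[9]
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "thresh": int(parts[5]),
            "raw": int(raw) if raw.isdigit() else raw,
        })
    return rows

def smart_warnings(rows):
    """Return human-readable warnings for failing or suspicious attributes."""
    out = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            # Normalized value at or below the threshold.
            out.append(f'{r["name"]}: value {r["value"]} <= threshold {r["thresh"]}')
        elif r["name"] in WATCH and isinstance(r["raw"], int) and r["raw"] > 0:
            out.append(f'{r["name"]}: raw value {r["raw"]}')
    return out

for warning in smart_warnings(parse_smart_attributes(SAMPLE)):
    print(warning)
```

On these rows the sketch flags the reallocated sector count (value below threshold) plus the pending and uncorrectable sector counts.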

Disk8 needs a new SATA cable:

February 12, 2018

Author

and new diagnostics.

Just out of interest, how do you know disk8 needs a new SATA cable? It's in the same cage as disk9, which is really inaccessible.

tower-diagnostics-20180212-1528.zip

February 12, 2018

Community Expert

14 minutes ago, cybrey said:

and new diagnostics.

SMART is still disabled.

10 minutes ago, cybrey said:

Just out of interest, how do you know disk8 needs a new SATA cable?

These:

Feb 12 14:49:30 Tower kernel: ata9.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
Feb 12 14:49:30 Tower kernel: ata9.00: irq_stat 0x08000000, interface fatal error
Feb 12 14:49:30 Tower kernel: ata9: SError: { Handshk }
Feb 12 14:49:30 Tower kernel: ata9.00: failed command: WRITE DMA
Feb 12 14:49:30 Tower kernel: ata9.00: cmd ca/00:10:d0:d2:00/00:00:00:00:00/e0 tag 20 dma 8192 out
Feb 12 14:49:30 Tower kernel:         res 50/00:00:df:d2:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
Feb 12 14:49:30 Tower kernel: ata9.00: status: { DRDY }
Feb 12 14:49:30 Tower kernel: ata9: hard resetting link
Feb 12 14:49:30 Tower kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 12 14:49:30 Tower kernel: ata9.00: configured for UDMA/133
Feb 12 14:49:30 Tower kernel: ata9: EH complete

plus this:

199 UDMA_CRC_Error_Count    0x0032   200   198   000    Old_age   Always       -       295

Together these make it very likely it's a cable/connection issue. Keep monitoring that attribute: if it increases, there's still a problem.
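If you'd rather not scan the syslog by eye, a small Python sketch like the one below can count these messages per ATA port. The keyword list is hypothetical (picked from the lines quoted above), and mapping a port such as ata9 back to a disk still has to be done by hand:

```python
import re
from collections import Counter, defaultdict

# Matches e.g. "... kernel: ata9.00: failed command: WRITE DMA"
# and "... kernel: ata9: hard resetting link".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Hypothetical keyword list, taken from the messages quoted in this thread.
KEYWORDS = ("interface fatal error", "SError", "failed command", "hard resetting link")

def summarize_ata_events(log_text):
    """Count keyword hits per ATA port in a chunk of syslog text."""
    counts = defaultdict(Counter)
    for line in log_text.splitlines():
        match = ATA_LINE.search(line)
        if not match:
            continue
        port, message = match.groups()
        for keyword in KEYWORDS:
            if keyword in message:
                counts[port][keyword] += 1
    return counts

SAMPLE_LOG = """\
Feb 12 14:49:30 Tower kernel: ata9.00: irq_stat 0x08000000, interface fatal error
Feb 12 14:49:30 Tower kernel: ata9: SError: { Handshk }
Feb 12 14:49:30 Tower kernel: ata9.00: failed command: WRITE DMA
Feb 12 14:49:30 Tower kernel: ata9: hard resetting link
Feb 12 14:49:30 Tower kernel: ata9: EH complete
"""

for port, hits in summarize_ata_events(SAMPLE_LOG).items():
    print(port, dict(hits))
```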

February 12, 2018

Author

Really strange... the command worked:

[screenshot: image.png]

and the GUI confirms it's enabled:

[screenshot: image.png]

I've powered the box off and on, and attached new diagnostics.

tower-diagnostics-20180212-1554.zip

February 12, 2018

Community Expert

Hmm, maybe the diags don't include it because this disk is disabled. Try this instead:

smartctl -s on /dev/sdl

Then:

smartctl -a /dev/sdl

and post the output

February 12, 2018

Author

Apologies, several screenshots. Couldn't think of an easy way of getting a file off the machine;

February 12, 2018

Community Expert

The disk looks fine. There's one CRC error, which is not a big deal but might indicate that the cable is a problem and the reason the disk got disabled.

The problem is that you only have one parity disk and disk4 is failing, so a rebuild will most likely result in some (or a lot of) corrupted files. You also shouldn't rebuild on top of the old disk, only onto a new one. Alternatively, if no new files have been written to that disk since it got disabled, the best way forward would be a new config without disk4 (or with a new disk in its place), then copying everything you can from the old disk4.
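To see why a failing disk4 puts a disk9 rebuild at risk, here is a toy Python sketch of single (XOR) parity. The byte values are made up, and it is simplified: a real bad sector would normally surface as a read error rather than silently wrong data, but either way the rebuilt block can't be trusted:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy data: three "data disks", 4 bytes each (illustrative values only).
disk4 = bytes([0x10, 0x20, 0x30, 0x40])
disk8 = bytes([0x01, 0x02, 0x03, 0x04])
disk9 = bytes([0xAA, 0xBB, 0xCC, 0xDD])

# Single parity is the XOR of all data disks.
parity = xor_blocks([disk4, disk8, disk9])

# Rebuilding disk9 needs parity plus every other data disk.
rebuilt_ok = xor_blocks([parity, disk4, disk8])

# Simulate disk4 returning one wrong byte during the rebuild.
disk4_bad = bytes([0x10, 0xFF, 0x30, 0x40])
rebuilt_bad = xor_blocks([parity, disk4_bad, disk8])

print(rebuilt_ok == disk9)    # healthy disks: rebuild matches the original
print(rebuilt_bad == disk9)   # bad byte on disk4: rebuilt disk9 is corrupted
```

Every bad byte read from disk4 lands in the same position on the rebuilt disk9, which is why the new config route is safer when disk9's own contents are still intact.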

February 12, 2018

Author

I'm a little confused as to why Disk9 is disabled yet the issues are appearing on Disk4?

The machine has been off since this issue appeared and I've not been able to access the shares, so no new files should have been created.

Disk4 is pretty small, so I'm happy to lose it completely. What are the steps for pulling disk4 out of the array and then attempting to recover data from it?

February 12, 2018

Community Expert

3 minutes ago, cybrey said:

I'm a little confused as to why Disk9 is disabled yet the issues are appearing on Disk4?

Two separate and unrelated issues.

3 minutes ago, cybrey said:

What are the steps for pulling disk4 out of the array and then attempting to recover data from it?

-Tools -> New Config -> Retain current configuration: All -> Apply
-assign any missing disk(s)
-unassign disk4 (you can leave slot 4 empty or assign one of the other disks to it)
-start array to begin parity sync

Disk9 will likely mount, but if it's still unmountable, don't format it. Wait for the sync to finish and run xfs_repair at the end.

When done use the UD plugin to mount disk4 and copy everything you can to the array.

February 12, 2018

Author

Disk9 mounted without any issues and the parity is rebuilding.

Many thanks as ever for all your help, greatly appreciated.

Wish I could work out why I've been having so many issues recently.

February 12, 2018

Community Expert

Except for disk4, which really is failing, most of your issues appear to be cable related. I recommend you update to v6.4.1 and make sure notifications are enabled, and you'll be warned about CRC errors, usually a sign of a bad cable. Acknowledge any existing values, since this attribute never resets; if it increases for any disk there's still a problem, likely the SATA cable, but it can also be the backplane, the controller port or, in very rare cases, the disk itself.

February 13, 2018

Author

Parity rebuilt successfully and I've updated to the latest version. However, the UD plugin is stuck at this:

[screenshot: image.png]

Edited February 13, 2018 by cybrey

February 13, 2018

Community Expert

Try rebooting and make sure you have the latest plugin version. If it's still the same, post in the UD support thread, and don't forget to post your diags.

February 14, 2018

Author

Just completed a reboot, still in the same state. Diagnostics attached.

tower-diagnostics-20180214-0933.zip

Edited February 14, 2018 by cybrey

February 14, 2018

Community Expert

You need to post on the UD plugin support thread, with the diagnostics.

P.S. there are still ATA errors on disk8 and CRC error count increased from 295 to 300, likely a bad SATA cable:

199 UDMA_CRC_Error_Count    -O--CK   200   198   000    -    300
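Since the CRC counter never resets, what matters is the change from the last acknowledged value. A minimal Python sketch of that comparison, using a hypothetical helper name; the disk8 counts are the ones from this thread, and the unchanged disk9 reading is an assumption for illustration:

```python
def crc_increases(acknowledged, current):
    """Return {disk: (old, new)} for disks whose CRC count rose above the baseline."""
    changed = {}
    for disk, count in current.items():
        base = acknowledged.get(disk, 0)
        if count > base:
            changed[disk] = (base, count)
    return changed

# Baseline: counts already acknowledged (disk8 at 295, disk9 with its single error).
acknowledged = {"disk8": 295, "disk9": 1}

# Later reading: disk8 from the latest diagnostics; disk9 assumed unchanged.
current = {"disk8": 300, "disk9": 1}

print(crc_increases(acknowledged, current))
```

Only disk8 shows up in the result, matching the advice to look at its cable first.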

February 14, 2018

Author

OK, will post on the UD plugin support page and get that cable swapped out.
