Multiple disks failing (all Toshiba disks) and multiple disks with errors



Each molex connector has a current limit, and each connector at the PSU has a limit too. Voltage is a finicky b*, and any sag can reset the drive.

 

SATA-to-molex adapters aren't the most reliable, but they should work in a pinch. Long term, my suggestion is to add proper connectors to additional lines: 3 backplanes (3 individual molex connectors) x 2 lines to the PSU. Better yet, 2 backplanes x 3 lines would be optimal.
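As a rough back-of-the-envelope illustration of why more lines help (assuming ~2 A per drive on the 12V rail during spin-up and 24 drives total; both numbers are assumptions, not measurements):

# Spin-up current per PSU line for 24 drives at ~2 A each (assumed),
# split across 2 vs 3 lines back to the PSU.
awk 'BEGIN { a = 24 * 2; printf "2 lines: %d A/line, 3 lines: %d A/line\n", a/2, a/3 }'

Even 16 A per line is a lot for a single wire pair, which is why spreading the load across more lines matters.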

 

 

Link to comment

48 minutes ago, abhi.ko said:

Thanks, but I am a little confused: this is all still on a single 12V rail, right, irrespective of which connector we plug it into?

Inside a molex connector there are two voltages, 12V and 5V. Every wire and every contact point has some resistance, and V = I*R, so when the current I increases, the voltage drop along the whole path increases too. If all the load sits on one wire, the voltage will sag too much (especially the 5V), and that's why we use several connectors and wires to spread the load.
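To put rough numbers on that, here is a minimal sketch of the voltage-drop math; the wire gauge, run length, and per-drive current are illustrative assumptions:

# V = I*R: 8 drives at ~2 A each on one line, through 1 m of 18 AWG
# wire (~0.021 ohm/m, doubled for the return path). All values assumed.
awk 'BEGIN { I = 8 * 2; R = 0.021 * 1 * 2; printf "sag = %.2f V\n", I * R }'

A sag of roughly 0.67 V on the 12V rail is already outside the ATX ±5% tolerance, so it only takes a few drives sharing one wire to cause resets.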

Edited by Vr2Io
Link to comment

Thank you both for clarifying. So long term, if I get a PSU that has 6 SATA/PERIF 6-pin outputs, something like this one, and set up a one-backplane-to-one-connector ratio on the PSU, that should be ideal, right? 4 HDDs on one PSU connector.

 

Also, if I use SATA-to-molex cables to connect these single backplanes to the SATA PSU connectors, would that be okay, or do I need to get 6 molex-to-6-pin PSU cables?

 

Appreciate all the help, guys. I'm learning things I did not know, so thank you very much.

Edited by abhi.ko
Link to comment

Yes, 1 connector for each backplane, spread across at least 2 lines back to the PSU (preferably 3). You can use your unused SATA power lines by adding your own molex connectors directly to the line using punch-down style connectors, as mentioned elsewhere in this thread: https://www.moddiy.com/products/DIY-IDE-Molex-Power-EZ-Crimp-Connector-%2d-Black.html

That's exactly what I did and it's been solid since.

Link to comment
On 2/8/2022 at 1:01 PM, JorgeB said:

Yep, check filesystem on both disabled disks, then if they mount look for a lost+found folder, if there's a lot of files there it's probably best to re-sync parity instead of rebuilding.
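For reference, that check from the Unraid terminal looks like this, with the array started in maintenance mode (the md device numbers are assumed to follow the disk slots):

# Read-only check: -n = no modify, -v = verbose.
xfs_repair -nv /dev/md12
xfs_repair -nv /dev/md19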

@JorgeB As instructed, I ran the XFS filesystem check with the -nv option on both my disabled disks (12 & 19), and both could not proceed and came back with this:

Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
Primary superblock would have been modified.
Cannot proceed further in no_modify mode.
Exiting now.

What do you recommend? Should I even try a repair, or just resync parity? Could you please specify the steps for resyncing parity with 2 failed drives?

 

Thank you @Michael_P and @Vr2Io - I will distribute the molex connections better and report back. I am planning to use 1 backplane to 1 SATA/PERIF connector on the PSU.

Link to comment
2 hours ago, JorgeB said:

Run it again without -n or nothing will be done.

Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
resetting superblock root inode pointer to 128
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

 

Got this now. The disks are unmountable, as seen earlier. So should I run the repair with -L, or ditch the repair attempt and just resync parity to the same disks?

Edited by abhi.ko
Link to comment
28 minutes ago, trurl said:

Since Unraid had already determined the drive was unmountable, you have no choice but to use -L. Also, you are currently trying to repair the filesystem of the emulated disk. Nothing will be done to the actual physical disk.
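In concrete terms, that repair would be something like this from the terminal, still in maintenance mode (-L zeroes the metadata log, so it is only for filesystems that refuse to mount):

# Destroy the log and repair; use only when the filesystem cannot be
# mounted to replay the log first.
xfs_repair -L /dev/md12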

Thank you. Below is the output of running it with the -L option on Disk 12 (sdv):


Phase 1 - find and verify superblock...
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap inode pointer to 97
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary inode pointer to 98
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_icount 0, counted 127424
sb_ifree 0, counted 48832
sb_fdblocks 1220420880, counted 466358182
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad CRC for inode 128
bad CRC for inode 128, will rewrite
cleared inode 128
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
        - agno = 4
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (14:3179447) is ahead of log (1:2).
Format log to cycle 17.
done

Planning to run this now from the terminal:

xfs_repair -v /dev/md12

Can you confirm that is the right next step?

Link to comment
16 minutes ago, itimpi said:

Looks like you have already run the repair. Have you tried restarting in normal mode to see if the emulated drive now mounts? If it does, look to see if there is a lost+found folder on the drive, as that is where the repair process puts any content it could not correctly identify.

Oh, that's right: I ran it without the -n, just with the -L, so the repair is already done. You are right.

 

Let me do the same for disk 19 and then try to mount both drives by starting the array without maintenance mode. Will report soon.

 

Edited by abhi.ko
Link to comment

Phase 1 - find and verify superblock...
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
resetting superblock root inode pointer to 128
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_icount 0, counted 100480
sb_ifree 0, counted 224
sb_fdblocks 1952984857, counted 1078863610
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad CRC for inode 128
bad CRC for inode 128, will rewrite
cleared root inode 128
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 3
        - agno = 6
        - agno = 5
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
reinitializing root directory
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 131, moving to lost+found
disconnected dir inode 1171237, moving to lost+found
disconnected dir inode 4294967432, moving to lost+found
disconnected dir inode 6442451075, moving to lost+found
disconnected dir inode 6517809021, moving to lost+found
disconnected dir inode 10737418371, moving to lost+found
disconnected dir inode 15032385664, moving to lost+found
disconnected dir inode 15032385666, moving to lost+found
disconnected dir inode 15032385667, moving to lost+found
disconnected dir inode 15609373013, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 98436057 nlinks from 2 to 12
Maximum metadata LSN (1:438928) is ahead of log (1:2).
Format log to cycle 4.
done

 

Disk 19 had a lot more lost+found moves.

Edited by abhi.ko
Link to comment
9 minutes ago, itimpi said:

Does its content look like it should? The absence of that folder normally means the repair has been fully successful and all data is intact.

 

Sorry, I honestly don't have a frame of reference for what was on the disk before to compare against, so I can't tell if anything is actually missing. I took it to mean the same (i.e. the repair was successful). Is there still a chance of corruption?

7 minutes ago, itimpi said:

Does the lost+found folder have recognisable content, or only items with cryptic names?

Numbered folders within lost+found, but recognizable subfolders within those.

[Screenshots attached: 2022-02-13 12_19_06-Tower_Browse.png, 2022-02-13 12_19_29-Tower_Browse.png]

Edited by abhi.ko
Link to comment
9 minutes ago, itimpi said:

You get entries in the lost+found folder when the repair process cannot find a directory entry to give the correct name. Since the sub-folders have recognisable names, you should be able to sort out the original path.
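A minimal sketch of what that sorting out could look like from the terminal; the folder names here are purely hypothetical examples:

# List what the repair recovered on disk 19, then move a recognised
# folder back to its original share path. "131" and "Movies" are
# made-up names for illustration, not names from this system.
ls /mnt/disk19/lost+found
mv /mnt/disk19/lost+found/131/Movies /mnt/disk19/Movies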

OK, so just to clarify the next steps:

 

For Disk 19: Do I just rename the numbered directories to the original share names? How do I change the disk from the current disabled status?

 

For Disk 12: what do I do?

Link to comment
