Multiple disks failing (all Toshiba disks) and multiple disks with errors



Each molex connector has a current limit, and each connector at the PSU has a limit too. Voltage is a finicky b*, and any sag can reset the drive.

 

SATA-to-molex adapters aren't the most reliable, but they should work in a pinch. Long term, my suggestion is to add proper connectors to additional lines: 3 backplanes (3 individual molex connectors) x 2 lines to the PSU. Better yet, 2 backplanes x 3 lines would be optimal.
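As a rough back-of-the-envelope illustration of why more lines help (assuming ~2 A per drive on the 12V rail during spin-up and 24 drives total; both numbers are assumptions, not measurements):

# Spin-up current per PSU line for 24 drives at ~2 A each (assumed),
# split across 2 vs 3 lines back to the PSU.
awk 'BEGIN { a = 24 * 2; printf "2 lines: %d A/line, 3 lines: %d A/line\n", a/2, a/3 }'

Even 16 A per line is a lot for a single wire pair, which is why spreading the load across more lines matters.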

 

 

Link to comment

48 minutes ago, abhi.ko said:

Thanks, but I am a little confused: this is all still on a single 12V rail, right, irrespective of which connector we plug it into?

Inside a molex connector there are two voltages, 12V and 5V. Every wire and every contact point has some resistance, and V = I*R, so when the current I increases, the voltage drop along the whole path increases too. If all the load sits on one wire, the voltage will sag too much (especially the 5V), and that's why we use several connectors and wires to spread the load.
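To put rough numbers on that, here is a minimal sketch of the voltage-drop math; the wire gauge, run length, and per-drive current are illustrative assumptions:

# V = I*R: 8 drives at ~2 A each on one line, through 1 m of 18 AWG
# wire (~0.021 ohm/m, doubled for the return path). All values assumed.
awk 'BEGIN { I = 8 * 2; R = 0.021 * 1 * 2; printf "sag = %.2f V\n", I * R }'

A sag of roughly 0.67 V on the 12V rail is already outside the ATX ±5% tolerance, so it only takes a few drives sharing one wire to cause resets.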

Edited by Vr2Io
Link to comment

Thank you both for clarifying. So long term, if I get a PSU that has 6 SATA/PERIF 6-pin outputs, something like this one, and set up a one-backplane-to-one-connector ratio on the PSU, that should be ideal, right? 4 HDDs on one PSU connector.

 

Also, if I use SATA-to-molex cables to connect these single backplanes to the SATA PSU connectors, would that be okay, or do I need to get 6 molex-to-6-pin PSU cables?

 

Appreciate all the help, guys. I'm learning things I did not know, so thank you very much.

Edited by abhi.ko
Link to comment

Yes, 1 connector for each backplane, spread across at least 2 lines back to the PSU (preferably 3). You can use your unused SATA power lines by adding your own molex connectors directly to the line using punch-down style connectors, as mentioned elsewhere in this thread: https://www.moddiy.com/products/DIY-IDE-Molex-Power-EZ-Crimp-Connector-%2d-Black.html

That's exactly what I did and it's been solid since.

Link to comment
On 2/8/2022 at 1:01 PM, JorgeB said:

Yep, check filesystem on both disabled disks, then if they mount look for a lost+found folder, if there's a lot of files there it's probably best to re-sync parity instead of rebuilding.
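For reference, that check from the Unraid terminal looks like this, with the array started in maintenance mode (the md device numbers are assumed to follow the disk slots):

# Read-only check: -n = no modify, -v = verbose.
xfs_repair -nv /dev/md12
xfs_repair -nv /dev/md19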

@JorgeB As instructed, I ran the XFS filesystem check with the -nv option on both my disabled disks (12 & 19), and both could not proceed and came back with this:

Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
Primary superblock would have been modified.
Cannot proceed further in no_modify mode.
Exiting now.

What do you recommend? Should I even try a repair, or just resync parity? Could you please specify the steps for resyncing parity with 2 failed drives?

 

Thank you @Michael_P and @Vr2Io - I will distribute the molex connections better and report back. I am planning to use 1 backplane to 1 SATA/PERIF connector on the PSU.

Link to comment
2 hours ago, JorgeB said:

Run it again without -n or nothing will be done.

Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
resetting superblock root inode pointer to 128
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

 

Got this now. The disks are unmountable, as seen earlier. So should I run the repair with -L, or ditch the repair attempt and just resync parity to the same disks?

Edited by abhi.ko
Link to comment
28 minutes ago, trurl said:

Since Unraid had already determined the drive was unmountable, you have no choice but to use -L. Also, you are currently trying to repair the filesystem of the emulated disk. Nothing will be done to the actual physical disk.
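In concrete terms, that repair would be something like this from the terminal, still in maintenance mode (-L zeroes the metadata log, so it is only for filesystems that refuse to mount):

# Destroy the log and repair; use only when the filesystem cannot be
# mounted to replay the log first.
xfs_repair -L /dev/md12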

Thank you. Below is the output of running it with the -L option on Disk 12 (sdv):


Phase 1 - find and verify superblock...
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap inode pointer to 97
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary inode pointer to 98
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_icount 0, counted 127424
sb_ifree 0, counted 48832
sb_fdblocks 1220420880, counted 466358182
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad CRC for inode 128
bad CRC for inode 128, will rewrite
cleared inode 128
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
        - agno = 4
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (14:3179447) is ahead of log (1:2).
Format log to cycle 17.
done

Planning to run this now from the terminal:

xfs_repair -v /dev/md12

Can you confirm that is the right next step?

Link to comment
16 minutes ago, itimpi said:

Looks like you have already run the repair. Have you tried restarting in normal mode to see if the emulated drive now mounts? If it does, look to see if there is a lost+found folder on the drive, as that is where the repair process puts any content it could not correctly identify.

Oh, that's right: I ran it without the -n, just with the -L, so the repair is already done. You are right.

 

Let me do the same for disk 19 and then try to mount both drives by starting the array without maintenance mode. Will report soon.

 

Edited by abhi.ko
Link to comment

Phase 1 - find and verify superblock...
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
resetting superblock root inode pointer to 128
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_icount 0, counted 100480
sb_ifree 0, counted 224
sb_fdblocks 1952984857, counted 1078863610
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad CRC for inode 128
bad CRC for inode 128, will rewrite
cleared root inode 128
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 3
        - agno = 6
        - agno = 5
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
reinitializing root directory
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 131, moving to lost+found
disconnected dir inode 1171237, moving to lost+found
disconnected dir inode 4294967432, moving to lost+found
disconnected dir inode 6442451075, moving to lost+found
disconnected dir inode 6517809021, moving to lost+found
disconnected dir inode 10737418371, moving to lost+found
disconnected dir inode 15032385664, moving to lost+found
disconnected dir inode 15032385666, moving to lost+found
disconnected dir inode 15032385667, moving to lost+found
disconnected dir inode 15609373013, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 98436057 nlinks from 2 to 12
Maximum metadata LSN (1:438928) is ahead of log (1:2).
Format log to cycle 4.
done

 

Disk 19 had a lot more lost+found moves.

Edited by abhi.ko
Link to comment
9 minutes ago, itimpi said:

Does its content look like it should? The absence of that folder normally means the repair has been fully successful and all data is intact.

 

Sorry, I honestly don't have a frame of reference for what was on the disk before to compare against, so I can't tell if anything is actually missing. I took it to mean the same (i.e. the repair was successful). Is there still a chance of corruption?

7 minutes ago, itimpi said:

Does the lost+found folder have recognisable content, or only items with cryptic names?

Numbered folders within lost+found, but recognizable subfolders within those.

[Screenshots attached: 2022-02-13 12_19_06-Tower_Browse.png, 2022-02-13 12_19_29-Tower_Browse.png]

Edited by abhi.ko
Link to comment
9 minutes ago, itimpi said:

You get entries in the lost+found folder when the repair process cannot find a directory entry to give the correct name. Since the sub-folders have recognisable names, you should be able to sort out the original path.
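A minimal sketch of what that sorting out could look like from the terminal; the folder names here are purely hypothetical examples:

# List what the repair recovered on disk 19, then move a recognised
# folder back to its original share path. "131" and "Movies" are
# made-up names for illustration, not names from this system.
ls /mnt/disk19/lost+found
mv /mnt/disk19/lost+found/131/Movies /mnt/disk19/Movies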

OK, so just to clarify the next steps:

 

For Disk 19: Do I just rename the numbered directories to the original share names? How do I change the disk from the current disabled status?

 

For Disk 12: what do I do?

Link to comment
