[SOLVED] 2 discs failed simultaneously ...



Hello,

 

I use my Unraid server mostly for data storage and streaming, and I was puzzled that 2 drives became unmountable without any prior warning.

The filesystem is XFS, so I tried to repair them with xfs_repair without the -n option. That did not work, so I used the recommended -L option. The 2 drives are still not mountable, but the content is still there, provided by the parity discs.

 

I attached the diagnostics file "tower-diagnostics-20200810-1216.zip". Let me know if you need anything else.

 

 

Please provide me with an easy-to-understand procedure for repairing the drives using the parity discs.

 

cheers alex

 

tower-diagnostics-20200810-1216.zip

44 minutes ago, johnnie.black said:

Parity can't help with filesystem corruption. According to your diags, discs 12 and 13 weren't mounting, but they are now, likely as a result of running xfs_repair.

No, they are still marked as unmountable, as there is no (or a corrupt) file system present. I wonder what the parity discs are for, if not for repairing corrupt data?
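
One way to picture why parity cannot repair a corrupt filesystem: parity only records what the disks contain, not whether that content makes sense. The sketch below is a simplified single-parity model using plain XOR over made-up byte strings. It is an illustration under that assumption, not Unraid's actual implementation. A disk that is missing can be rebuilt from parity, but if bad data was written through the array, parity was updated to match, and the rebuild returns the same bad data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks holding one tiny block each.
disks = [b"XFSB-good", b"disk-two!", b"disk-3333"]
parity = xor_blocks(disks)

# Case 1: disk 0 is missing entirely; parity plus the survivors rebuild it.
rebuilt = xor_blocks([parity, disks[1], disks[2]])

# Case 2: garbage was written to disk 0 through the array, so parity was
# updated to match the garbage. Rebuilding now returns the garbage.
corrupted = b"XFSB-BAD!"
parity_after_bad_write = xor_blocks([corrupted, disks[1], disks[2]])
rebuilt_after_bad_write = xor_blocks([parity_after_bad_write, disks[1], disks[2]])

print(rebuilt)
print(rebuilt_after_bad_write)
```

In this model the first rebuild gives back the original block and the second gives back the corrupted one, which is why a filesystem repair tool is still needed on the emulated disk.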


That's not what the diags show, initially:

 

Aug 10 01:10:27 Tower emhttpd: shcmd (85): mkdir -p /mnt/disk12
Aug 10 01:10:27 Tower emhttpd: shcmd (86): mount -t xfs -o noatime,nodiratime /dev/md12 /mnt/disk12
Aug 10 01:10:27 Tower kernel: XFS (md12): Metadata CRC error detected at xfs_sb_read_verify+0x114/0x15e [xfs], xfs_sb block 0xffffffffffffffff
Aug 10 01:10:27 Tower kernel: XFS (md12): Unmount and run xfs_repair
Aug 10 01:10:27 Tower kernel: XFS (md12): First 128 bytes of corrupted metadata buffer:
Aug 10 01:10:27 Tower kernel: 00000000d3e59167: 58 46 53 42 00 00 10 00 00 00 00 00 1d 1c 11 0e  XFSB............
Aug 10 01:10:27 Tower kernel: 00000000680eb2cf: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Aug 10 01:10:27 Tower kernel: 000000007cb90224: 29 d1 16 8f 8d 1c 42 6b b6 a7 42 d1 f4 88 c4 75  ).....Bk..B....u
Aug 10 01:10:27 Tower kernel: 00000000c3566992: 00 00 00 00 10 00 00 05 00 00 00 00 00 00 00 60  ...............`
Aug 10 01:10:27 Tower kernel: 00000000323821d9: 00 00 00 00 00 00 00 61 00 00 00 00 00 00 00 62  .......a.......b
Aug 10 01:10:27 Tower kernel: 00000000f1b45957: 00 00 00 01 07 47 04 44 00 00 00 04 00 00 00 00  .....G.D........
Aug 10 01:10:27 Tower kernel: 00000000d994568a: 00 03 a3 82 b4 b5 02 00 02 00 00 08 00 00 00 00  ................
Aug 10 01:10:27 Tower kernel: 0000000019056dcd: 00 00 00 00 00 00 00 00 0c 09 09 03 1b 00 00 05  ................
Aug 10 01:10:27 Tower kernel: XFS (md12): SB validate failed with error -74.
Aug 10 01:10:27 Tower root: mount: /mnt/disk12: mount(2) system call failed: Structure needs cleaning.
Aug 10 01:10:27 Tower emhttpd: shcmd (86): exit status: 32
Aug 10 01:10:27 Tower emhttpd: /mnt/disk12 mount error: No file system
Aug 10 01:10:27 Tower emhttpd: shcmd (87): umount /mnt/disk12
Aug 10 01:10:27 Tower root: umount: /mnt/disk12: not mounted.
Aug 10 01:10:27 Tower emhttpd: shcmd (87): exit status: 32
Aug 10 01:10:27 Tower emhttpd: shcmd (88): rmdir /mnt/disk12
Aug 10 01:10:27 Tower emhttpd: shcmd (89): mkdir -p /mnt/disk13
Aug 10 01:10:27 Tower emhttpd: shcmd (90): mount -t xfs -o noatime,nodiratime /dev/md13 /mnt/disk13
Aug 10 01:10:27 Tower root: mount: /mnt/disk13: mount(2) system call failed: Structure needs cleaning.
Aug 10 01:10:27 Tower kernel: XFS (md13): Metadata CRC error detected at xfs_sb_read_verify+0x114/0x15e [xfs], xfs_sb block 0xffffffffffffffff
Aug 10 01:10:27 Tower kernel: XFS (md13): Unmount and run xfs_repair
Aug 10 01:10:27 Tower kernel: XFS (md13): First 128 bytes of corrupted metadata buffer:
Aug 10 01:10:27 Tower kernel: 000000006f8c05c4: 58 46 53 42 00 00 10 00 00 00 00 00 1d 1c 11 0e  XFSB............
Aug 10 01:10:27 Tower kernel: 00000000ce15d76b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Aug 10 01:10:27 Tower kernel: 00000000204e19e6: 84 8f 09 65 2b b2 41 fc 85 8c e1 ba fb 86 15 3f  ...e+.A........?
Aug 10 01:10:27 Tower kernel: 00000000ef363133: 00 00 00 00 10 00 00 05 00 00 00 00 00 00 00 60  ...............`
Aug 10 01:10:27 Tower kernel: 000000008480b48b: 00 00 00 00 00 00 00 61 00 00 00 00 00 00 00 62  .......a.......b
Aug 10 01:10:27 Tower kernel: 000000004876be3a: 00 00 00 01 07 47 04 44 00 00 00 04 00 00 00 00  .....G.D........
Aug 10 01:10:27 Tower kernel: 000000002d34a656: 00 03 a3 82 b4 b5 02 00 02 00 00 08 00 00 00 00  ................
Aug 10 01:10:27 Tower kernel: 00000000188e404a: 00 00 00 00 00 00 00 00 0c 09 09 03 1b 00 00 05  ................
Aug 10 01:10:27 Tower kernel: XFS (md13): SB validate failed with error -74.
Aug 10 01:10:27 Tower emhttpd: shcmd (90): exit status: 32
Aug 10 01:10:27 Tower emhttpd: /mnt/disk13 mount error: No file system
Aug 10 01:10:27 Tower emhttpd: shcmd (91): umount /mnt/disk13
Aug 10 01:10:27 Tower root: umount: /mnt/disk13: not mounted.

 

Then:

Aug 10 01:36:43 Tower emhttpd: shcmd (430): mkdir -p /mnt/disk12
Aug 10 01:36:43 Tower emhttpd: shcmd (431): mount -t xfs -o noatime,nodiratime /dev/md12 /mnt/disk12
Aug 10 01:36:43 Tower kernel: XFS (md12): Mounting V5 Filesystem
Aug 10 01:36:44 Tower kernel: XFS (md12): Starting recovery (logdev: internal)
Aug 10 01:36:44 Tower kernel: XFS (md12): Ending recovery (logdev: internal)
Aug 10 01:36:44 Tower emhttpd: shcmd (432): xfs_growfs /mnt/disk12
Aug 10 01:36:44 Tower root: meta-data=/dev/md12              isize=512    agcount=4, agsize=122094660 blks
Aug 10 01:36:44 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Aug 10 01:36:44 Tower root:          =                       crc=1        finobt=1, sparse=0, rmapbt=0
Aug 10 01:36:44 Tower root:          =                       reflink=0
Aug 10 01:36:44 Tower root: data     =                       bsize=4096   blocks=488378638, imaxpct=5
Aug 10 01:36:44 Tower root:          =                       sunit=0      swidth=0 blks
Aug 10 01:36:44 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Aug 10 01:36:44 Tower root: log      =internal log           bsize=4096   blocks=238466, version=2
Aug 10 01:36:44 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Aug 10 01:36:44 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
Aug 10 01:36:44 Tower emhttpd: shcmd (433): mkdir -p /mnt/disk13
Aug 10 01:36:44 Tower emhttpd: shcmd (434): mount -t xfs -o noatime,nodiratime /dev/md13 /mnt/disk13
Aug 10 01:36:44 Tower kernel: XFS (md13): Mounting V5 Filesystem
Aug 10 01:36:44 Tower kernel: XFS (md13): Starting recovery (logdev: internal)
Aug 10 01:36:44 Tower kernel: XFS (md13): Ending recovery (logdev: internal)
Aug 10 01:36:44 Tower emhttpd: shcmd (435): xfs_growfs /mnt/disk13
Aug 10 01:36:44 Tower root: meta-data=/dev/md13              isize=512    agcount=4, agsize=122094660 blks
Aug 10 01:36:44 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Aug 10 01:36:44 Tower root:          =                       crc=1        finobt=1, sparse=0, rmapbt=0
Aug 10 01:36:44 Tower root:          =                       reflink=0
Aug 10 01:36:44 Tower root: data     =                       bsize=4096   blocks=488378638, imaxpct=5
Aug 10 01:36:44 Tower root:          =                       sunit=0      swidth=0 blks
Aug 10 01:36:44 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Aug 10 01:36:44 Tower root: log      =internal log           bsize=4096   blocks=238466, version=2
Aug 10 01:36:44 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Aug 10 01:36:44 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
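
The first line of each hex dump above can be decoded by hand to see what the kernel was looking at. The sketch below assumes the usual XFS superblock layout, where the block starts with a 4-byte magic, a 4-byte block size and an 8-byte data block count, all big-endian. The variable names are only illustrative.

```python
import struct

# First 16 bytes of the md12 superblock, copied from the kernel hex dump above.
dump_line = "58 46 53 42 00 00 10 00 00 00 00 00 1d 1c 11 0e"
raw = bytes.fromhex(dump_line)

# Assumed layout: 4-byte magic, 4-byte block size, 8-byte data block count,
# all stored big-endian on disk.
magic, block_size, data_blocks = struct.unpack(">4sIQ", raw)

print(magic, block_size, data_blocks)
```

The decoded values agree with the bsize=4096 and blocks=488378638 that xfs_growfs reports after the repair, which suggests the start of the superblock was readable and the checksum was what failed to verify.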

 

Please post new diags after array start.

 


Okay 


here we are sir ;) .....

new tower-diagnostics-20200811-1320.zip

23 hours ago, johnnie.black said:

Also note that both disks will still need to be rebuilt, but only if the emulated disks are mounting and showing the correct data.

Hmm, how can I find out whether the data is still correct?

And how can I start the rebuild process?

I thought the parity drives allow replacing completely failed/defective disks ...

18 minutes ago, ahab666 said:

here we are sir ;) .....

Disks are mounting:

Filesystem      Size  Used Avail Use% Mounted on
/dev/md12       1.9T  1.7T  221G  89% /mnt/disk12
/dev/md13       1.9T  1.3T  539G  72% /mnt/disk13

 

15 minutes ago, ahab666 said:

Hmm, how can I find out whether the data is still correct?

It should be, but with XFS you'd need previously created checksums to be sure. Check if the used space looks correct, and if there's no lost+found folder, all should be fine.
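
A minimal sketch of what "previously created checksums" could look like, assuming Python is available on a machine that can read the disk share. The function names are hypothetical and this is not a built-in Unraid feature. The idea is to record a SHA-256 digest per file while the disk is healthy, then build a second manifest later and compare the two.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file under root (by relative path) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large media files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[str(path.relative_to(root))] = digest.hexdigest()
    return manifest


def changed_files(old, new):
    """Return paths from the old manifest that are missing or differ in the new one."""
    return sorted(p for p, d in old.items() if new.get(p) != d)


def has_lost_and_found(root):
    """True if a lost+found folder exists at the top level of root."""
    return (Path(root) / "lost+found").is_dir()
```

For example, a manifest built from /mnt/disk12 before a problem could be saved as JSON and later passed to changed_files together with a fresh manifest. An empty result plus no lost+found folder would be a reasonable sign that the files survived the repair.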

 

16 minutes ago, ahab666 said:

And how can I start the rebuild process?

https://wiki.unraid.net/Troubleshooting#Re-enable_the_drive


Thank you so much. The last link to the wiki enabled me to start the rebuild / re-enable process.

FYI, the weirdest thing though: I was able to see the files in Dolphin and even on my Windows PC (emulated, probably) via Samba and NFS.

 

Now my most stupid question: how can I tag the thread as solved (by editing the topic line?)

  • JorgeB changed the title to [SOLVED] 2 discs failed simultaneously ...
