
[SOLVED] XFS errors on parity emulated disk


Joseph


HELP!!

 

I have a disk that is, or at least I think it is, failing. After 15 minutes or so (upon spin-down?), the contents of the drive disappear. Forcing the drive to spin up didn't bring the contents back. Restarting the array did bring them back online, only for the issue to repeat. So I stopped the array and physically removed the drive. Currently, the contents of the disk are being emulated by the parity disks; however, after 15 minutes or so, the contents of the emulated drive disappeared too. This has happened at least once, and I'm trying to replicate the issue. Stopping and restarting the array has brought the emulated contents back online.

 

Additionally, the logs report XFS metadata corruption on the emulated disk (md9) and say to unmount it and run xfs_repair.

 

Quote

Jul 30 12:28:18 Tower kernel: XFS (md9): Corruption warning: Metadata has LSN (1:490844) ahead of current LSN (1:490797). Please unmount and run xfs_repair (>= v4.3) to resolve.
Jul 30 12:28:18 Tower kernel: XFS (md9): Metadata corruption detected at xfs_agf_read_verify+0xb0/0xba, xfs_agf block 0x15d508ec9
Jul 30 12:28:18 Tower kernel: XFS (md9): Unmount and run xfs_repair
Jul 30 12:28:18 Tower kernel: XFS (md9): First 64 bytes of corrupted metadata buffer:
Jul 30 12:28:18 Tower kernel: ffff8803ce9a2a00: 58 41 47 46 00 00 00 01 00 00 00 03 0e 8e 05 f0  XAGF............
Jul 30 12:28:18 Tower kernel: ffff8803ce9a2a10: 00 00 53 28 00 00 9e da 00 00 00 00 00 00 00 02  ..S(............
Jul 30 12:28:18 Tower kernel: ffff8803ce9a2a20: 00 00 00 02 00 00 00 00 00 00 00 1b 00 00 00 20  ............... 
Jul 30 12:28:18 Tower kernel: ffff8803ce9a2a30: 00 00 00 06 00 32 01 b2 00 01 00 44 00 00 00 06  .....2.....D....
Jul 30 12:28:18 Tower kernel: XFS (md9): metadata I/O error: block 0x15d508ec9 ("xfs_trans_read_buf_map") error 117 numblks 1
Jul 30 12:28:18 Tower kernel: XFS (md9): page discard on page ffffea000f067840, inode 0x180054ec0, offset 4096.
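For anyone reading these lines later: a minimal Python sketch, assuming the log format is exactly what is quoted above, that pulls the device, block number, and error code out of the "metadata I/O error" line. The function name `parse_xfs_io_error` is just a hypothetical helper for illustration. On Linux, error 117 is `EUCLEAN` ("Structure needs cleaning"), which is the code XFS returns for on-disk corruption, so it is consistent with the "run xfs_repair" messages rather than pointing at a separate fault.

```python
import errno
import re

# Matches kernel lines such as:
#   ... kernel: XFS (md9): metadata I/O error: block 0x15d508ec9 ("...") error 117 numblks 1
# The pattern is an assumption based only on the lines quoted in this thread.
XFS_IO_ERROR = re.compile(
    r"XFS \((?P<dev>[^)]+)\): metadata I/O error: "
    r"block (?P<block>0x[0-9a-f]+) .*?error (?P<err>\d+)"
)


def parse_xfs_io_error(line):
    """Return device, block and errno details for one log line, or None."""
    m = XFS_IO_ERROR.search(line)
    if m is None:
        return None
    code = int(m.group("err"))
    return {
        "device": m.group("dev"),
        "block": int(m.group("block"), 16),
        "errno": code,
        # errno.errorcode is platform-specific; on Linux 117 maps to 'EUCLEAN'.
        "errno_name": errno.errorcode.get(code, "unknown"),
    }


if __name__ == "__main__":
    sample = (
        'Jul 30 12:28:18 Tower kernel: XFS (md9): metadata I/O error: '
        'block 0x15d508ec9 ("xfs_trans_read_buf_map") error 117 numblks 1'
    )
    print(parse_xfs_io_error(sample))
```

This only summarises what the kernel already logged; it does not check or repair anything on the filesystem.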

 

Can xfs_repair be run (or should it be run) on emulated contents? What's the best way to go about rescuing the contents of the 'failed' drive?

 

Thanks for any assistance that is offered!!

 

CHEERS!


Thanks for your quick reply, johnnie.black.

 

FWIW, the short SMART test completed without error, but the drive made an occasional, odd clicking sound. I have the drive set up for a warranty replacement, which should be here by Wednesday, so I'll have to limp along on the emulated disk until then.

 

For fun, I put disk 10 in the disk 9 slot to see if it was an issue with the cable, controller, etc. So far, so good.

 

 


UPDATE:

 

So I ran xfs_repair on the emulated drive, and it seems to have corrected some occupied free space along with a few text files of no real importance to me. I rebooted, started the array clean, and now see some other issues:

 

Quote

Jul 30 14:48:30 Tower kernel: ata4.00: exception Emask 0x10 SAct 0x40000 SErr 0x400100 action 0x6 frozen
Jul 30 14:48:30 Tower kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Jul 30 14:48:30 Tower kernel: ata4.00: failed command: WRITE FPDMA QUEUED
Jul 30 14:48:30 Tower kernel: ata4: hard resetting link

 

I've attached the diagnostics zip file. Thoughts?

 

6 minutes ago, johnnie.black said:

Possibly a cable issue, the 850 EVOs are very picky about cables.

 

Based on what I've been googling, I suspect it might be a power/data cable issue. That's weird, because the box hasn't been moved since I added the second H310.

 

Is there a way to determine which disk is attached to ata4?
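One approach that may help here, sketched in Python: on Linux, each entry under `/sys/block` is a symlink whose resolved path includes an `ataN` component for disks handled by the kernel's libata driver, so resolving those links gives a port-to-disk mapping. The helper names `ata_port_from_sysfs_path` and `map_ata_ports` are hypothetical, and the example path in the comment is illustrative rather than taken from this server. Disks behind a SAS HBA such as the H310 are generally not driven by libata, so they would likely not show an `ataN` component at all.

```python
import os
import re

# An "ataN" path component, as in the illustrative resolved path:
#   /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sdc
ATA_PORT = re.compile(r"/(ata\d+)/")


def ata_port_from_sysfs_path(path):
    """Return the 'ataN' component of a resolved sysfs path, or None."""
    m = ATA_PORT.search(path)
    return m.group(1) if m else None


def map_ata_ports(sys_block="/sys/block"):
    """Map each ATA port name to the block devices found behind it."""
    mapping = {}
    if not os.path.isdir(sys_block):
        return mapping  # not Linux, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        real = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_from_sysfs_path(real)
        if port:
            mapping.setdefault(port, []).append(name)
    return mapping


if __name__ == "__main__":
    for port, disks in sorted(map_ata_ports().items()):
        print(port, "->", ", ".join(disks))
```

Once the `sdX` name behind ata4 is known, it can be matched to a slot by comparing against the device names shown for each disk on the array's main page.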

