November 18, 2024

Hi guys,

I have a 14-disk array of mainly 2.5" SSDs that has been running for years. Last week I noticed one disk with a red cross (disk 8) and ordered spares.

During rebuild #1 I had errors on another disk (disk 12), something like 300,000 of them, but the data sync reported success. I then swapped disk 12 and rebuilt: success, and the array returned to normal. But my directories are all gone from my mapped drive. If I look at the contents of a random disk, I do see data there (directories like "music", "audiobooks", "movies", etc.).

This is a 30 TB array, so I can't quickly move all that data to a temporary location. How do I restore my original shares?

There is one error in the log:

kernel: XFS (md8p1): metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" at daddr 0x37e3f730 len 32 error 117

Thanks in advance for all your help,
Steven

Edited November 18, 2024 by steven_76
November 19, 2024 | Community Expert | Solution

If there were read errors on another device during a rebuild, and with only single parity, the rebuilt disk will likely have some corruption; you should have stopped the rebuild.

Check the filesystem on disk8, and run the check without -n.

I'm also seeing ATA errors for multiple disks; you should investigate those.
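For anyone landing here with a similar log line, the advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `parse_xfs_error` and `repair_command` are hypothetical helper names, the mapping of `md8p1` to array disk 8 and the `/dev/md8p1` path are inferred from this thread rather than confirmed for every Unraid version, and the array normally has to be in maintenance mode before a filesystem check is run. On Linux, error 117 is EUCLEAN ("Structure needs cleaning"), which is what XFS reports when it finds corrupted metadata.

```python
import re

# Matches kernel lines like:
#   XFS (md8p1): metadata I/O error in "..." at daddr 0x... len 32 error 117
XFS_ERR = re.compile(r"XFS \((md(\d+)p\d+)\): metadata I/O error .* error (\d+)")


def parse_xfs_error(line):
    """Return the md device, its number, and the errno from an XFS error line."""
    m = XFS_ERR.search(line)
    if m is None:
        return None
    return {
        "device": m.group(1),      # e.g. "md8p1"
        "disk": int(m.group(2)),   # assumed to be the array disk number
        "errno": int(m.group(3)),  # 117 is EUCLEAN on Linux
    }


def repair_command(device, dry_run=True):
    """Build (but do not run) an xfs_repair command for the given md device.

    With dry_run=True the -n flag is added, so xfs_repair only reports
    problems; without -n it actually modifies the filesystem.
    The /dev/<device> path is an assumption taken from the log line.
    """
    cmd = ["xfs_repair"]
    if dry_run:
        cmd.append("-n")
    cmd.append("/dev/" + device)
    return cmd


LOG_LINE = (
    'kernel: XFS (md8p1): metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" '
    "at daddr 0x37e3f730 len 32 error 117"
)

info = parse_xfs_error(LOG_LINE)
print(info)
print(" ".join(repair_command(info["device"], dry_run=True)))
print(" ".join(repair_command(info["device"], dry_run=False)))
```

The sketch only prints the commands. Running the check with -n first shows what would be changed, and the run without -n is the one that actually repairs the filesystem.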
November 19, 2024 | Author

13 hours ago, JorgeB said:
    If there were read errors on another device during a rebuild, and with only single parity, the rebuilt disk will likely have some corruption, you should have stopped it. Check filesystem on disk8, run it without -n. Also seeing some ATA errors for multiple disks, you should investigate that.

I owe you a beer 😀

The check on disk 8 took 1 second and immediately solved my problem. It seems I have no data loss, and the share is online again and mapped. Very happy with this.

About the ATA errors: which lines in the config dump should I be looking for to identify those disk numbers?

I'm gradually swapping out all spinners for SSDs, as finances allow. I might be wrong, but I take it those are less sensitive to wear and tear.
November 20, 2024 | Community Expert

11 hours ago, steven_76 said:
    about ATA errors, which lines in config dump should i be looking for to ID those disc numbers?

Look at the lsscsi.txt file in the diags. For example, there are issues with ata17 in the log:

Nov 18 07:50:07 CUBE kernel: ata17.00: configured for UDMA/33
Nov 18 07:54:47 CUBE kernel: ata17: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 18 07:54:53 CUBE kernel: ata17.00: qc timeout after 5000 msecs (cmd 0xec)
Nov 18 07:54:53 CUBE kernel: ata17.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 18 07:54:53 CUBE kernel: ata17.00: revalidation failed (errno=-5)
Nov 18 07:54:53 CUBE kernel: ata17: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 18 07:54:53 CUBE kernel: ata17.00: supports DRM functions and may not be fully accessible

And in lsscsi.txt you can see that ata17 is sdq:

[10:0:0:0] disk ATA CT1000MX500SSD1 033 /dev/sdq /dev/sg16
  state=running queue_depth=32 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/10:0:0:0 [/sys/devices/pci0000:00/0000:00:15.0/0000:05:00.0/ata17/host10/target10:0:0/10:0:0:0]

Start by replacing the cables for the affected devices and monitor for new errors.

11 hours ago, steven_76 said:
    beer is no longer owed, cheers buddy

Thanks!
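The lookup described above (find the ataN ports that log errors, then find which /dev/sdX sits on each port) can be sketched in Python. This is a rough illustration, not an Unraid tool: the function names `ata_ports_with_errors` and `map_ata_to_device` are made up for this example, the keyword list used to spot error lines is an assumption based only on the messages quoted in this thread, and the parser expects lsscsi.txt records in the same layout as the one shown above.

```python
import re

# Record header such as "[10:0:0:0]" that starts each lsscsi entry.
HEADER = re.compile(r"\[\d+:\d+:\d+:\d+\]")
DEVICE = re.compile(r"/dev/sd[a-z]+")
ATA_IN_PATH = re.compile(r"/(ata\d+)/")
# Assumed error keywords, taken from the log lines quoted in this thread.
ATA_ERROR = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: .*(?:timeout|failed|error)")


def ata_ports_with_errors(syslog_text):
    """Return the set of ataN ports that appear in error-looking kernel lines."""
    ports = set()
    for line in syslog_text.splitlines():
        m = ATA_ERROR.search(line)
        if m:
            ports.add(m.group(1))
    return ports


def map_ata_to_device(lsscsi_text):
    """Map each ataN port to its /dev/sdX node using the sysfs path in lsscsi.txt."""
    starts = [m.start() for m in HEADER.finditer(lsscsi_text)]
    mapping = {}
    for begin, end in zip(starts, starts[1:] + [len(lsscsi_text)]):
        record = lsscsi_text[begin:end]
        dev = DEVICE.search(record)
        ata = ATA_IN_PATH.search(record)
        if dev and ata:
            mapping[ata.group(1)] = dev.group(0)
    return mapping


SYSLOG = """\
Nov 18 07:50:07 CUBE kernel: ata17.00: configured for UDMA/33
Nov 18 07:54:47 CUBE kernel: ata17: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 18 07:54:53 CUBE kernel: ata17.00: qc timeout after 5000 msecs (cmd 0xec)
Nov 18 07:54:53 CUBE kernel: ata17.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 18 07:54:53 CUBE kernel: ata17.00: revalidation failed (errno=-5)
"""

LSSCSI = """\
[10:0:0:0] disk ATA CT1000MX500SSD1 033 /dev/sdq /dev/sg16
  state=running queue_depth=32 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/10:0:0:0 [/sys/devices/pci0000:00/0000:00:15.0/0000:05:00.0/ata17/host10/target10:0:0/10:0:0:0]
"""

devices = map_ata_to_device(LSSCSI)
for port in sorted(ata_ports_with_errors(SYSLOG)):
    print(port, "->", devices.get(port, "unknown"))
```

With the excerpts from this thread as input, the sketch pairs ata17 with /dev/sdq, which matches the manual lookup. Ports that do not appear in lsscsi.txt are reported as "unknown" rather than guessed.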