  1. Good news: the server is now running fine and the parity sync has finished. I'll run preclear on the disks involved to see if they are really failing. Thank you @JorgeB for your help 😃
  2. That's it: the SATA connector was almost unplugged!
     Mar 8 13:40:10 godzilla kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
     Mar 8 13:40:10 godzilla kernel: ata2.00: irq_stat 0x08000000, interface fatal error
     Mar 8 13:40:10 godzilla kernel: ata2: SError: { UnrecovData HostInt 10B8B BadCRC }
     Mar 8 13:40:10 godzilla kernel: ata2.00: failed command: READ DMA EXT
     Mar 8 13:40:10 godzilla kernel: ata2.00: cmd 25/00:00:08:1b:b8/00:01:36:00:00/e0 tag 28 dma 131072 in
     Mar 8 13:40:10 godzilla kernel: res 50/00:00:8f:20:b8/00:00:36:00:00/e0 Emask 0x50 (AT
  3. I see this multiple times in the syslog; it keeps coming up. How can I tell which controller/port is involved?
     Mar 8 12:27:32 godzilla kernel: ata2: SError: { HostInt 10B8B LinkSeq }
     Mar 8 12:27:32 godzilla kernel: ata2.00: failed command: WRITE DMA EXT
     Mar 8 12:27:32 godzilla kernel: ata2.00: cmd 35/00:60:f8:c3:4d/00:00:1a:00:00/e0 tag 30 dma 49152 out
     Mar 8 12:27:32 godzilla kernel: res 50/00:00:17:85:4d/00:00:1a:00:00/e0 Emask 0x50 (ATA bus error)
     Mar 8 12:27:32 godzilla kernel: ata2.00: status: { DRDY }
     Mar 8 12:27:32 godzilla kernel: ata2: hard resetting link
     Mar 8 12:27:3
  4. Thank you 😃 The parity sync has started, but it's very slow: 20 MB/s. Edit: my bad, the Docker service had started and was slowing down the process. It now runs at 150 MB/s. I now also have UDMA CRC errors on the cache pools (both the VM and Docker pools) 🤥 The controller on the mainboard could also have a problem.
  5. I have a UDMA CRC error count of 15 on disk4.
  6. I'm confused. I replaced the SATA cable and changed the SATA port, and I still get an I/O error. So I changed the enclosure for disk4, and I still get an I/O error when running the check with -n from the GUI. In a terminal I then ran xfs_repair -n on disk4 in the external enclosure, and it is running, but with the GUI I get an I/O error. I have good hope for disk4, because the internal enclosure it was originally in had a problem: when I pulled disk4, the latch of the enclosure was broken and the disk was not held in place as it should have been.
  7. When I run xfs_repair -n on disk4, I get an input/output error.
  8. I'm starting to think the trouble is with the hot-plug enclosure that holds disks 2, 3 and 4.
  9. ddrescue went well and I recovered all the files. I did a new config with the new disk3. The parity sync started, but disk4 has been disabled. So I guess I have to stop the array to replace disk4. I have no spare 6 TB disk.
  10. I had planned on the second option, but I was not aware of the first, which seems simpler. I think I have to check file integrity on the cloned disk before syncing parity, though.
  11. OK, thank you for following my steps. ddrescue is currently at 5% rescued and still going... After the re-sync, can I add the rescued files from the UD device?
  12. I don't know whether -r3 means 3 passes or 3 scraping passes.
  13. OK, so it could be this:
      ddrescue -f -v -r3 /dev/sdq /dev/sdj /boot/ddrescue.log
      If all is OK, I stop the array, replace sdq with a new disk, assign it as disk3, start the array in normal mode, and mount sdj with UD. Logically, a parity sync will be triggered. With Krusader I can then add the files from UD to /mnt/disk3.
  14. Here is the situation:
      physical disk3 is /dev/sdq
      emulated disk3 is /dev/md3
      unassigned disk is /dev/sdj (precleared)
      my array is started in maintenance mode
  15. OK, I was thinking of using ddrescue. My plan was to run it on /dev/sdq (physical disk3) to clone it to another disk, then replace disk3 with a new disk and add back the files that could be saved. So, if I understand correctly what you mean, I'd better use ddrescue on /dev/md3 to clone it to another disk. Any files that could be saved are at that point on a disk which is not a member of the array. I mount that disk with Unassigned Devices, then replace disk3 with a new one. A parity sync will then start. At the end of the parity sync process I can add files to disk3 from the unassign
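
For reference, the clone step from post 13 can be wrapped in a small script that prints the command for review before anything is written. This is only a minimal sketch, not a tested procedure: the device names and mapfile path are the ones quoted in the posts above and will differ on another system, and DRY_RUN and build_ddrescue_cmd are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch only. Device names come from the posts above; verify them on your own
# system before running anything, because the destination is overwritten.
SRC=/dev/sdq            # failing physical disk3 (source)
DST=/dev/sdj            # precleared unassigned disk (destination)
MAP=/boot/ddrescue.log  # mapfile; lets an interrupted run resume where it stopped

# Build the ddrescue command line used in post 13:
#   -f   force overwriting the destination device
#   -v   verbose output
#   -r3  allow up to 3 retry passes over bad areas
build_ddrescue_cmd() {
    printf 'ddrescue -f -v -r3 %s %s %s\n' "$1" "$2" "$3"
}

# DRY_RUN is a hypothetical safety switch for this sketch: by default the
# command is only printed. Set DRY_RUN=0 to actually start the clone.
if [ "${DRY_RUN:-1}" = "1" ]; then
    build_ddrescue_cmd "$SRC" "$DST" "$MAP"
else
    ddrescue -f -v -r3 "$SRC" "$DST" "$MAP"
fi
```

Run as-is, the script only prints the command line. Keeping the mapfile on /boot, as in the original command, means the rescue state survives a reboot.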