caplam

Everything posted by caplam

  1. OK, thank you. The preclear should end in 4 or 5 hours. The rebuild should take at least 3 days.
  2. Finally the array is started and md4 is mounted. It's impossible to tell if all the files are here. I have not seen any lost+found directory. Can I rebuild disk3 and disk4 simultaneously? Is it preferable to copy the content of md4 to an external disk first?
  3. The start is not finished, but the log is showing this, so I guess it should be good.
     Jan 23 13:01:10 godzilla emhttpd: shcmd (849): mkdir -p /mnt/disk4
     Jan 23 13:01:10 godzilla emhttpd: shcmd (850): mount -t xfs -o noatime /dev/md4 /mnt/disk4
     Jan 23 13:01:10 godzilla kernel: XFS (md4): Mounting V5 Filesystem
     Jan 23 13:01:10 godzilla kernel: XFS (md4): Ending clean mount
     Jan 23 13:01:11 godzilla kernel: xfs filesystem being mounted at /mnt/disk4 supports timestamps until 2038 (0x7fffffff)
     Jan 23 13:01:11 godzilla emhttpd: shcmd (851): xfs_growfs /mnt/disk4
     Jan 23 13:01:11 godzilla root: meta-data=/dev/md4 isize=512 agcount=6, agsize=268435455 blks
     Jan 23 13:01:11 godzilla root: = sectsz=512 attr=2, projid32bit=1
     Jan 23 13:01:11 godzilla root: = crc=1 finobt=1, sparse=1, rmapbt=0
     Jan 23 13:01:11 godzilla root: = reflink=0
     Jan 23 13:01:11 godzilla root: data = bsize=4096 blocks=1465130633, imaxpct=5
     Jan 23 13:01:11 godzilla root: = sunit=0 swidth=0 blks
     Jan 23 13:01:11 godzilla root: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
     Jan 23 13:01:11 godzilla root: log =internal log bsize=4096 blocks=521728, version=2
     Jan 23 13:01:11 godzilla root: = sectsz=512 sunit=0 blks, lazy-count=1
     Jan 23 13:01:11 godzilla root: realtime =none extsz=4096 blocks=0, rtextents=0
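As a sanity check on the mount log above, the filesystem size can be worked out from the `data` line of the xfs_growfs output (block size times block count). This is only a minimal sketch: the function name `xfs_data_size_bytes` is made up for illustration, and it assumes the `data = bsize=... blocks=...` layout shown in the log.

```python
import re


def xfs_data_size_bytes(growfs_output: str) -> int:
    """Return the size of the XFS data section in bytes."""
    # Matches "data = bsize=N blocks=M". The "meta-data=/dev/md4" line does
    # not match, because whitespace is required between "data" and "=".
    match = re.search(r"\bdata\s+=\s+bsize=(\d+)\s+blocks=(\d+)", growfs_output)
    if match is None:
        raise ValueError("no 'data = bsize=... blocks=...' line found")
    block_size, block_count = (int(group) for group in match.groups())
    return block_size * block_count


# The data line exactly as it appears in the log for md4.
sample = "data = bsize=4096 blocks=1465130633, imaxpct=5"
size = xfs_data_size_bytes(sample)
print(f"{size} bytes = {size / 1e12:.2f} TB")  # 6001175072768 bytes = 6.00 TB
```

The result is consistent with md4 being the 6TB disk, which suggests the superblock geometry was restored sensibly; it says nothing about whether all files survived.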
  4. I'll use another disk and keep the current disk3 aside. Starting and stopping the array takes a very long time. I'm waiting for the start to finish to see if md4 is mounted.
  5. I'm stopping the array but it's taking pretty long. I will restart it in normal mode to see if md4 can be mounted. If yes, I suppose the next step is rebuilding disk3 and disk4 (for that I have to wait until the preclear ends).
  6. xfs_repair -L /dev/md4
     Phase 1 - find and verify superblock...
     sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
     resetting superblock root inode pointer to 128
     sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
     resetting superblock realtime bitmap inode pointer to 129
     sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
     resetting superblock realtime summary inode pointer to 130
     Phase 2 - using internal log
             - zero log...
     ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
             - scan filesystem freespace and inode maps...
     sb_icount 0, counted 40128
     sb_ifree 0, counted 349
     sb_fdblocks 1464608875, counted 440547342
             - found root inode chunk
     Phase 3 - for each AG...
             - scan and clear agi unlinked lists...
             - process known inodes and perform inode discovery...
             - agno = 0
             - agno = 1
             - agno = 2
             - agno = 3
             - agno = 4
             - agno = 5
             - process newly discovered inodes...
     Phase 4 - check for duplicate blocks...
             - setting up duplicate extent list...
             - check for inodes claiming duplicate blocks...
             - agno = 0
             - agno = 3
             - agno = 5
             - agno = 2
             - agno = 4
             - agno = 1
     Phase 5 - rebuild AG headers and trees...
             - reset superblock...
     Phase 6 - check inode connectivity...
             - resetting contents of realtime bitmap and summary inodes
             - traversing filesystem ...
             - traversal finished ...
             - moving disconnected inodes to lost+found ...
     Phase 7 - verify and correct link counts...
     Maximum metadata LSN (4:205143) is ahead of log (1:2).
     Format log to cycle 7.
     done
  7. xfs_repair -v /dev/md4
     Phase 1 - find and verify superblock...
     bad primary superblock - bad CRC in superblock !!!
     attempting to find secondary superblock...
     .found candidate secondary superblock...
     verified secondary superblock...
     writing modified primary superblock
             - block cache size set to 6137384 entries
     sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
     resetting superblock root inode pointer to 128
     sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
     resetting superblock realtime bitmap inode pointer to 129
     sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
     resetting superblock realtime summary inode pointer to 130
     Phase 2 - using internal log
             - zero log...
     zero_log: head block 205153 tail block 205149
     ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.
     With the GUI I can't run it; nothing happens.
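The two xfs_repair runs in posts 6 and 7 differ mainly in what they report about the log: one refuses to continue until the log is replayed, the other zeroes the log because `-L` was given. A small sketch of how such output could be told apart when scanning saved logs follows. The function name and the returned labels are hypothetical; only the matched message fragments are taken from the output quoted above.

```python
def classify_xfs_repair_output(output: str) -> str:
    """Classify xfs_repair output by the log-related message it contains."""
    # Collapse newlines and repeated spaces so wrapped messages still match.
    text = " ".join(output.split())
    if "log which needs to be replayed" in text:
        # xfs_repair stopped: mount the filesystem to replay the log first,
        # and only fall back to -L if the mount fails.
        return "replay-log-first"
    if "destroyed because the -L option was used" in text:
        # The log was zeroed; the most recent metadata changes may be lost.
        return "log-zeroed"
    return "unknown"


refused = ("ERROR: The filesystem has valuable metadata changes in a log "
           "which needs to be replayed.")
print(classify_xfs_repair_output(refused))
```

The labels mirror the advice printed by xfs_repair itself: a "replay-log-first" result means a mount attempt should come before any use of `-L`.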
  8. Perhaps I don't understand correctly. When I write disk3 I mean sdg, the physical disk. md3 is the "logical disk". As disk3 is disabled, md3 is emulated. Am I correct?
  9. Thank you for your answer. I will do that when the preclear ends. If I understood correctly: md3 is fine as it is mounted, so disk3 should be able to be rebuilt. md4 should have file system errors, and if xfs_repair corrects the errors I should be able to rebuild disk4. So once I have md3 and md4 without errors, I can rebuild disk3 and disk4 simultaneously. The current disk 4 (WD60EFRX) is not seen by the controller. It makes some clicking noises at startup.
  10. Hello all, Back in September I had to stop my server for a move. When I restarted it I had problems with some disks. I finally decided to give it a chance. I bought 2 WD My Book 6TB to shuck the drives. Right now one is in the preclearing process. So the situation is: normally my array has 2 6TB parity disks and 4 data drives. Disks 1, 2 and 3 are 4TB disks; disk 4 is 6TB. Now disks 1 and 2 are OK. Disk 3 is disabled; I can read the emulated content. I tried xfs_repair -L without success. The disk can't be mounted. Disk 4 is not detected and I can't read the emulated disk. What can I do to recover the content of disks 3 & 4 and get the server back online? For now the array is started, VMs and docker are disabled, and a preclear is running. I attached diags. godzilla-diagnostics-20220122-1520.zip
  11. I guess moving data out of the array was not the right choice. I have to find a way. I want to drop Unraid. In 2.5 years this is not the first time I'm in such trouble, and it takes ages to recover. I never had such problems with my Syno or my Proxmox server.
  12. I'm in trouble. I was transferring data out of emulated disk 4 when disk3 had errors, and it is now disabled. Disk 2 also started to have errors. Now in /mnt/user/ I can't see any of the files which were on disk3 or disk4, but I can see /mnt/disk4 and not /mnt/disk3. godzilla-diagnostics-20210929-2226.zip
  13. No, I don't have enough space on the other disks. Right now my server is down, so I can't post diags. I have 2 parity disks (6TB) and 4 array disks (4, 4, 4 and 6 TB). I have shares on disks 1 & 2 and other shares on disks 3 & 4. The disk that disappeared is number 4. I really think disk 4 is dead, as the LSI controller doesn't detect it and it made some "click" noises on starting. If I can't move data from emulated disk 4 to new disks 5 & 6 (4TB), I think I have 2 choices:
      - Transfer the data out of the array, add 2 disks, make a new config and transfer the data back to the new disks. The downside is that it will put heavy activity on the parity drives.
      - Buy a 6TB disk to replace disk 4.
      I prefer the first solution, as I have 7 empty slots and 4 or 5 4TB disks. Moreover, there are no interesting deals on HDDs right now.
  14. Last week I started to have read and write errors on an array disk. I did not have time to deal with this as I had to move to a new house. Yesterday I started my server and now the disk is not seen anymore (even by the controller). The problem is I only have smaller replacement drives. How can I deal with this? Is it possible to add 2 drives to the array and transfer from the emulated disk to the 2 new ones? I guess not, as parity will logically be rebuilt when adding drives.
  15. Finally got preclear running by uninstalling and reinstalling it. I guess it was a dependency problem (perhaps due to an interaction with Nerd Pack).
  16. I can format the disks in exFAT, but I can't format them in XFS. Preclear is still stuck on "starting". I found a fourth disk from an old Synology RAID. It shows the same behaviour.
  17. Yes, it's enabled. Usually I have no problems preclearing, but I haven't done this for a few months. The server was rebooted 2 days ago for an update of the Nvidia driver.
  18. Hello, I'm selling my Synology NAS and I want to preclear its disks to have spares. I put 3x 3TB Western Digital Red drives in my SAS expansion bay as unassigned drives. Unraid recognised them as Linux RAID members. Preclear won't run; it keeps showing "starting". I can erase the partitions but I can't format them in XFS with the GUI. What can I do? godzilla-diagnostics-20210705-1207.zip
  19. Good news. The server is now running fine. Parity sync is now finished. I'll run preclear on the disks I used to see if they are really failing. Thank you @JorgeB for your help 😃
  20. That's it. The SATA connector was almost unplugged!!!
      Mar 8 13:40:10 godzilla kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
      Mar 8 13:40:10 godzilla kernel: ata2.00: irq_stat 0x08000000, interface fatal error
      Mar 8 13:40:10 godzilla kernel: ata2: SError: { UnrecovData HostInt 10B8B BadCRC }
      Mar 8 13:40:10 godzilla kernel: ata2.00: failed command: READ DMA EXT
      Mar 8 13:40:10 godzilla kernel: ata2.00: cmd 25/00:00:08:1b:b8/00:01:36:00:00/e0 tag 28 dma 131072 in
      Mar 8 13:40:10 godzilla kernel: res 50/00:00:8f:20:b8/00:00:36:00:00/e0 Emask 0x50 (ATA bus error)
      Mar 8 13:40:10 godzilla kernel: ata2.00: status: { DRDY }
      Mar 8 13:40:10 godzilla kernel: ata2: hard resetting link
      Mar 8 13:40:10 godzilla kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      Mar 8 13:40:10 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 13:40:10 godzilla kernel: sd 3:0:0:0: [sdd] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
      Mar 8 13:40:10 godzilla kernel: sd 3:0:0:0: [sdd] tag#28 Sense Key : 0x5 [current]
      Mar 8 13:40:10 godzilla kernel: sd 3:0:0:0: [sdd] tag#28 ASC=0x21 ASCQ=0x4
      Mar 8 13:40:10 godzilla kernel: sd 3:0:0:0: [sdd] tag#28 CDB: opcode=0x28 28 00 36 b8 1b 08 00 01 00 00
      Mar 8 13:40:10 godzilla kernel: blk_update_request: I/O error, dev sdd, sector 918035208 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
      Mar 8 13:40:10 godzilla kernel: ata2: EH complete
      Mar 8 13:40:14 godzilla kernel: ata2: SATA link down (SStatus 0 SControl 300)
      Mar 8 13:40:20 godzilla kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      Mar 8 13:40:20 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 13:40:21 godzilla root: Total Spundown: 0
      It's running OK now. I feel like a black cat. 🙃
  21. I have this multiple times in the syslog. It keeps coming up. How can I know which controller/port is involved?
      Mar 8 12:27:32 godzilla kernel: ata2: SError: { HostInt 10B8B LinkSeq }
      Mar 8 12:27:32 godzilla kernel: ata2.00: failed command: WRITE DMA EXT
      Mar 8 12:27:32 godzilla kernel: ata2.00: cmd 35/00:60:f8:c3:4d/00:00:1a:00:00/e0 tag 30 dma 49152 out
      Mar 8 12:27:32 godzilla kernel: res 50/00:00:17:85:4d/00:00:1a:00:00/e0 Emask 0x50 (ATA bus error)
      Mar 8 12:27:32 godzilla kernel: ata2.00: status: { DRDY }
      Mar 8 12:27:32 godzilla kernel: ata2: hard resetting link
      Mar 8 12:27:32 godzilla kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      Mar 8 12:27:32 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 12:27:32 godzilla kernel: ata2: EH complete
      Mar 8 12:29:00 godzilla kernel: ata2.00: exception Emask 0x40 SAct 0x0 SErr 0x880800 action 0x6
      Mar 8 12:29:00 godzilla kernel: ata2.00: irq_stat 0x40000001
      Mar 8 12:29:00 godzilla kernel: ata2: SError: { HostInt 10B8B LinkSeq }
      Mar 8 12:29:00 godzilla kernel: ata2.00: failed command: FLUSH CACHE EXT
      Mar 8 12:29:00 godzilla kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 10
      Mar 8 12:29:00 godzilla kernel: res 51/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x41 (internal error)
      Mar 8 12:29:00 godzilla kernel: ata2.00: status: { DRDY ERR }
      Mar 8 12:29:00 godzilla kernel: ata2.00: error: { ABRT }
      Mar 8 12:29:00 godzilla kernel: ata2: hard resetting link
      Mar 8 12:29:00 godzilla kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      Mar 8 12:29:00 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 12:29:00 godzilla kernel: ata2.00: device reported invalid CHS sector 0
      Mar 8 12:29:00 godzilla kernel: ata2: EH complete
      Mar 8 12:29:16 godzilla kernel: ata2.00: exception Emask 0x40 SAct 0x0 SErr 0x880800 action 0x6
      Mar 8 12:29:16 godzilla kernel: ata2.00: irq_stat 0x40000001
      Mar 8 12:29:16 godzilla kernel: ata2: SError: { HostInt 10B8B LinkSeq }
      Mar 8 12:29:16 godzilla kernel: ata2.00: failed command: FLUSH CACHE EXT
      Mar 8 12:29:16 godzilla kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 22
      Mar 8 12:29:16 godzilla kernel: res 51/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x41 (internal error)
      Mar 8 12:29:16 godzilla kernel: ata2.00: status: { DRDY ERR }
      Mar 8 12:29:16 godzilla kernel: ata2.00: error: { ABRT }
      Mar 8 12:29:16 godzilla kernel: ata2: hard resetting link
      Mar 8 12:29:16 godzilla kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      Mar 8 12:29:16 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 12:29:16 godzilla kernel: ata2.00: device reported invalid CHS sector 0
      Mar 8 12:29:16 godzilla kernel: ata2: EH complete
      Mar 8 12:29:46 godzilla kernel: ata2.00: exception Emask 0x40 SAct 0x0 SErr 0x880800 action 0x6
      Mar 8 12:29:46 godzilla kernel: ata2.00: irq_stat 0x40000001
      Mar 8 12:29:46 godzilla kernel: ata2: SError: { HostInt 10B8B LinkSeq }
      Mar 8 12:29:46 godzilla kernel: ata2.00: failed command: WRITE DMA
      Mar 8 12:29:46 godzilla kernel: ata2.00: cmd ca/00:08:98:05:7b/00:00:00:00:00/eb tag 18 dma 4096 out
      Mar 8 12:29:46 godzilla kernel: res 51/04:08:98:05:7b/00:00:0b:00:00/eb Emask 0x41 (internal error)
      Mar 8 12:29:46 godzilla kernel: ata2.00: status: { DRDY ERR }
      Mar 8 12:29:46 godzilla kernel: ata2.00: error: { ABRT }
      Mar 8 12:29:46 godzilla kernel: ata2: hard resetting link
      Mar 8 12:29:46 godzilla kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      Mar 8 12:29:46 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 12:29:46 godzilla kernel: ata2: EH complete
      Mar 8 12:30:02 godzilla kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x880800 action 0x6 frozen
      Mar 8 12:30:02 godzilla kernel: ata2.00: irq_stat 0x08000000, interface fatal error
      Mar 8 12:30:02 godzilla kernel: ata2: SError: { HostInt 10B8B LinkSeq }
      Mar 8 12:30:02 godzilla kernel: ata2.00: failed command: WRITE DMA EXT
      Mar 8 12:30:02 godzilla kernel: ata2.00: cmd 35/00:08:f8:37:a9/00:00:36:00:00/e0 tag 20 dma 4096 out
      Mar 8 12:30:02 godzilla kernel: res 50/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x50 (ATA bus error)
      Mar 8 12:30:02 godzilla kernel: ata2.00: status: { DRDY }
      Mar 8 12:30:02 godzilla kernel: ata2: hard resetting link
      Mar 8 12:30:02 godzilla kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      Mar 8 12:30:02 godzilla kernel: ata2.00: configured for UDMA/33
      Mar 8 12:30:02 godzilla kernel: ata2: EH complete
      Edit: found it. It's the port used for the M.2 SATA SSD of the VM pool. I must have "moved" it when reinstalling the CPU2 daughter card (the server is a Z620 with 2 CPUs).
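On the question in post 21 of which port `ata2` is: one common approach on Linux is to resolve each `/sys/block/<device>` symlink and look for an `ataN` component in the resulting path. The sketch below assumes that sysfs layout, which holds for drives attached through libata (onboard SATA/AHCI); drives behind a SAS HBA such as the LSI controller normally have no `ataN` component and will simply not be listed. The function names and the example path are hypothetical.

```python
import os
import re
from typing import Dict, List, Optional


def ata_port_from_sysfs_path(path: str) -> Optional[str]:
    """Extract the 'ataN' component from a resolved sysfs device path."""
    match = re.search(r"/(ata\d+)/", path)
    return match.group(1) if match else None


def block_devices_by_ata_port(sys_block: str = "/sys/block") -> Dict[str, List[str]]:
    """Map each ataN port to the block devices found behind it."""
    mapping: Dict[str, List[str]] = {}
    if not os.path.isdir(sys_block):
        return mapping  # not on Linux, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        # /sys/block/<name> is a symlink into the device tree.
        resolved = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_from_sysfs_path(resolved)
        if port is not None:
            mapping.setdefault(port, []).append(name)
    return mapping


if __name__ == "__main__":
    for port, devices in block_devices_by_ata_port().items():
        print(port, "->", ", ".join(devices))
```

Once the device name behind the port is known, it can be matched against the disk assignments shown in the Unraid GUI.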
  22. Thank you 😃 Parity sync has started, but it's very slow: 20 MB/s. Edit: my bad, the docker service had started and that was slowing down the process. Now it runs at 150 MB/s. Now I also have UDMA CRC errors on the cache pools (both the VM and docker pools) 🤥 The controller on the mainboard could also have a problem.
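To put the two speeds in post 22 into perspective, here is a rough duration estimate for a full pass over a 6TB parity disk. It is only a sketch: it assumes a constant transfer rate and decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), whereas real sync speed usually drops towards the end of the disk. The function name `sync_hours` is made up for illustration.

```python
def sync_hours(disk_bytes: float, speed_mb_per_s: float) -> float:
    """Hours needed to read or write a whole disk at a constant speed."""
    seconds = disk_bytes / (speed_mb_per_s * 1e6)
    return seconds / 3600


PARITY_BYTES = 6e12  # 6TB parity disk, decimal terabytes assumed

for speed in (20, 150):
    print(f"{speed} MB/s -> {sync_hours(PARITY_BYTES, speed):.1f} h")
# 20 MB/s -> 83.3 h
# 150 MB/s -> 11.1 h
```

At 20 MB/s the sync would take about three and a half days, against roughly half a day at 150 MB/s, which is why stopping the docker service made such a difference.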
  23. I have a UDMA CRC error count of 15 on disk4. godzilla-diagnostics-20210308-1202.zip
  24. I'm confused. I replaced the SATA cable, changed the SATA port, and still get an I/O error. So I changed the enclosure for disk 4, and I still get an I/O error when doing a check with -n from the GUI. So in a terminal I ran xfs_repair -n on disk4 in the external enclosure and it's running, but with the GUI I get an I/O error. I have good hope for disk 4, as the internal enclosure in which disk4 was originally installed had a problem. When I pulled disk 4, the latch of the enclosure was broken and the disk was not held in place as it should be.
  25. When I run xfs_repair -n on disk4 I get an input/output error.