March 27, 2017
A Call Trace error appeared one day. I looked into it, downloaded my diagnostics as requested, and am posting here for assistance. I tried a simple reboot to see if the error remained, and now I cannot get her to come back up. Diagnostics attached, thanks in advance.
nextgen-tower-diagnostics-20170327-1804.zip
March 27, 2017
Mar 26 16:14:15 NextGen-Tower kernel: XFS (md3): _xfs_buf_find: Block out of range: block 0x874704438, EOFS 0xe8e08870
Check Disk Filesystem on disk 3
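To make the kernel message concrete, here is a minimal Python sketch that pulls the device name and the two hexadecimal values out of the log line quoted above. The regular expression and the function name `parse_xfs_range_error` are illustrative assumptions written against this one line, not part of any Unraid or XFS tool. EOFS is read here as the end-of-filesystem limit that the requested block is compared against.

```python
import re

# Hypothetical pattern, written against the single log line quoted in this thread.
PATTERN = re.compile(
    r"XFS \((?P<dev>\w+)\): _xfs_buf_find: Block out of range: "
    r"block (?P<block>0x[0-9a-fA-F]+), EOFS (?P<eofs>0x[0-9a-fA-F]+)"
)


def parse_xfs_range_error(line):
    """Return the device and block numbers from the log line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    block = int(match.group("block"), 16)
    eofs = int(match.group("eofs"), 16)
    return {
        "device": match.group("dev"),
        "block": block,
        "eofs": eofs,
        # The kernel complains when the requested block is at or past the limit.
        "out_of_range": block >= eofs,
    }


LOG_LINE = (
    "Mar 26 16:14:15 NextGen-Tower kernel: XFS (md3): _xfs_buf_find: "
    "Block out of range: block 0x874704438, EOFS 0xe8e08870"
)

info = parse_xfs_range_error(LOG_LINE)
print(info)
```

The device in parentheses (md3) is what points at disk 3, and the requested block is far beyond the reported limit, which is why a filesystem check is the suggested next step.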
March 28, 2017 (Author)
Thanks sir, I see that in the log now too. I am searching for a command to determine which drive is my disk3 (parity possibly) before I proceed. Is there an easy way to determine this?
March 28, 2017
It's /dev/md3. You want to follow the directions in that link. Running the commands against the base drive will invalidate your parity.
March 28, 2017 (Author)
From my lsscsi.log file, I see this. Would I be correct to say that disk3 is my WD Black 2TB drive (WDC WD2002FAEX-0)?

[3:0:0:0] disk ATA WDC WD2002FAEX-0 1D05 /dev/sdd /dev/sg3
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:0:0 [/sys/devices/pci0000:00/0000:00:1f.2/ata3/host3/target3:0:0/3:0:0:0]

Edited March 28, 2017 by JGKos
March 28, 2017 (Author)
Yes, I am following the link directions, but was unsure of my parity drive. Unraid is my first dive into Linux in general, so everything here is new to me. Going to take baby steps here. Thx.
March 28, 2017
Probably, but either do the XFS checks via the webUI or, if you want to work from the command prompt, use /dev/md3.
March 28, 2017
Just now, JGKos said:
    yes, I am following the link directions, but was unsure of my parity drive. unraid is my first dive into linux in general, so everything here is new to me. going to take baby steps here thx
The easiest way to see which disk is which is to look at the Main tab.
March 28, 2017 (Author)
Yeah, after my reboot I cannot get the UI back up, so the command line looks to be my only option.
March 28, 2017 (Author)
Reseating the USB key gave me the option to boot into Unraid GUI mode, which I have done, but I cannot get the GUI to load via the web browser.
Edited March 28, 2017 by JGKos
March 28, 2017
I should have explained better. I wanted you to pull the stick, pop it into another computer, modify config/disk.cfg, and set arrayStart="no"
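For anyone who prefers to script that edit rather than open the file in a text editor, here is a minimal Python sketch. It assumes config/disk.cfg is a plain text file of key="value" lines, which matches the arrayStart="no" form given above; the helper name `set_array_start` and the second setting in the sample are hypothetical.

```python
import re


def set_array_start(cfg_text, value="no"):
    """Return cfg_text with the arrayStart line set to value (added if missing)."""
    new_line = f'arrayStart="{value}"'
    # Match the whole arrayStart line without swallowing the line ending.
    pattern = re.compile(r"^arrayStart=[^\r\n]*", re.MULTILINE)
    if pattern.search(cfg_text):
        return pattern.sub(new_line, cfg_text, count=1)
    # Key not present: append it on its own line.
    separator = "" if (not cfg_text or cfg_text.endswith("\n")) else "\n"
    return cfg_text + separator + new_line + "\n"


# Hypothetical sample content; a real disk.cfg holds other settings as well.
sample = 'arrayStart="yes"\nexampleSetting="1"\n'
updated = set_array_start(sample)
print(updated)
```

To apply it to the real file, read config/disk.cfg from the flash drive, keep a backup copy, and write the returned text back. Doing the same change by hand in a text editor works just as well.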
March 28, 2017 (Author)
Done. After rebooting, the GUI is now up, and only now do I see the importance of that "do not start array upon start up" check. I will follow the link now to troubleshoot further. Much thanks Squid... much thanks.
March 28, 2017 (Author)
Results are in:

Not available
Phase 1 - find and verify superblock...
        - block cache size set to 2666096 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 513384 tail block 513014
        - scan filesystem freespace and inode maps...
Metadata corruption detected at xfs_agf block 0x74704441/0x200
flfirst 118 in agf 2 too large (max = 118)
agf 118 freelist blocks bad, skipping freelist scan
freeblk count 5 != flcount 6 in ag 3
agi unlinked bucket 63 is 16759103 in ag 2 (inode=2164242751)
sb_icount 126528, counted 70528
sb_ifree 296, counted 5721
sb_fdblocks 41520637, counted 42557659
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 2164242751, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 2164242751 nlinks from 0 to 2
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Mon Mar 27 20:02:13 2017

Phase       Start            End              Duration
Phase 1:    03/27 20:01:58   03/27 20:01:58
Phase 2:    03/27 20:01:58   03/27 20:01:59   1 second
Phase 3:    03/27 20:01:59   03/27 20:02:06   7 seconds
Phase 4:    03/27 20:02:06   03/27 20:02:06
Phase 5:    Skipped
Phase 6:    03/27 20:02:06   03/27 20:02:13   7 seconds
Phase 7:    03/27 20:02:13   03/27 20:02:13

Total run time: 15 seconds

Am I correct to say my next step should be running xfs_repair -v /dev/md3 from the options box?
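The read-only check output in the post above is long, so here is a minimal Python sketch that pulls out the two kinds of lines that matter most when deciding on a repair: the "would ..." lines, which describe changes a real repair would make, and the superblock counter mismatches. The function name `summarize_dry_run` is hypothetical, and the sample text is only an excerpt of the output posted above.

```python
import re

# Matches lines such as "sb_icount 126528, counted 70528".
COUNTER_RE = re.compile(r"(sb_\w+) (\d+), counted (\d+)")


def summarize_dry_run(output):
    """Collect pending changes and superblock counter mismatches from check output."""
    pending = []
    counters = {}
    for raw in output.splitlines():
        line = raw.strip()
        if "would" in line:
            pending.append(line)
        match = COUNTER_RE.fullmatch(line)
        if match:
            counters[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return {"pending": pending, "counters": counters}


# Excerpt of the output posted in this thread.
SAMPLE = """\
sb_icount 126528, counted 70528
sb_ifree 296, counted 5721
sb_fdblocks 41520637, counted 42557659
        - found root inode chunk
disconnected dir inode 2164242751, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 2164242751 nlinks from 0 to 2
No modify flag set, skipping filesystem flush and exiting.
"""

summary = summarize_dry_run(SAMPLE)
for item in summary["pending"]:
    print("pending:", item)
for name, (stored, counted) in summary["counters"].items():
    print(f"{name}: stored {stored}, counted {counted}")
```

Nothing here touches the disk; it only reads text that the check already produced.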
March 28, 2017 (Author)
It does not appear to be working. All I get is "Not available", and it relists the xfs_repair options.

xfs_repair -v /dev/md3

I checked to ensure that my drives were labeled "md3", and I see them as /dev/sdd:

[3:0:0:0] disk ATA WDC WD2002FAEX-0 1D05 /dev/sdd /dev/sg3
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:0:0 [/sys/devices/pci0000:00/0000:00:1f.2/ata3/host3/target3:0:0/3:0:0:0]

I modified the command to xfs_repair -v /dev/sdd, no dice... same for /dev/sg3.
March 28, 2017
If the system is still in maintenance mode, then it sounds like the drive dropped offline. Disk #3 is /dev/md3. If you really want, you can run it against /dev/sdd1, but you will invalidate parity. @johnnie.black, though, is the real expert around here.
March 28, 2017 (Author)
Yes: "Started - Maintenance Mode". The drive is online per the GUI's Main page. I suspect the drive is bad and will look to replace it. I should have enough space to move the data off this drive to the rest of the drives. Is there a FAQ for that? That way I can just remove the drive fully and add a new one later. Thanks again.
Edited March 28, 2017 by JGKos
March 28, 2017
51 minutes ago, JGKos said:
    I modified the command to xfs_repair -v /dev/sdd, no dice...same for /dev/sg3
Please don't experiment with this. If you do the wrong thing you will make things worse. Don't try anything at all without asking first. You have made the wrong guess several times in this thread already.
32 minutes ago, JGKos said:
    i suspect the drive is bad
Very unlikely to be the case, since the SMART report for the disk looks OK. Connections are by far the most common problem.
March 28, 2017 (Author)
Agreed, and I'm probably getting ahead of myself. Just trying to figure this out on my own. It's good to know the drive should be OK as well. I appreciate the assistance. Ideas on next steps? Thanks.
March 28, 2017
According to the diags, the drive is still there. I can't really tell if it's in maintenance mode, but try the repair again via the UI and post the actual output and errors that appear.
March 28, 2017 (Author)
From the options box, -nv output:

Not available
Phase 1 - find and verify superblock...
        - block cache size set to 2666096 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 513384 tail block 513014
        - scan filesystem freespace and inode maps...
Metadata corruption detected at xfs_agf block 0x74704441/0x200
flfirst 118 in agf 2 too large (max = 118)
agf 118 freelist blocks bad, skipping freelist scan
freeblk count 5 != flcount 6 in ag 3
agi unlinked bucket 63 is 16759103 in ag 2 (inode=2164242751)
sb_icount 126528, counted 70528
sb_ifree 296, counted 5721
sb_fdblocks 41520637, counted 42557659
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 0
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 2164242751, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 2164242751 nlinks from 0 to 2
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Mon Mar 27 21:56:39 2017

Phase       Start            End              Duration
Phase 1:    03/27 21:56:24   03/27 21:56:24
Phase 2:    03/27 21:56:24   03/27 21:56:25   1 second
Phase 3:    03/27 21:56:25   03/27 21:56:32   7 seconds
Phase 4:    03/27 21:56:32   03/27 21:56:32
Phase 5:    Skipped
Phase 6:    03/27 21:56:32   03/27 21:56:39   7 seconds
Phase 7:    03/27 21:56:39   03/27 21:56:39

Total run time: 15 seconds

From the options box, -v /dev/md3 output:

Not available
Usage: xfs_repair [options] device

Options:
  -f           The device is a file
  -L           Force log zeroing. Do this as a last resort.
  -l logdev    Specifies the device where the external log resides.
  -m maxmem    Maximum amount of memory to be used in megabytes.
  -n           No modify mode, just checks the filesystem for damage.
  -P           Disables prefetching.
  -r rtdev     Specifies the device where the realtime section resides.
  -v           Verbose output.
  -c subopts   Change filesystem parameters - use xfs_admin.
  -o subopts   Override default behaviour, refer to man page.
  -t interval  Reporting interval in seconds.
  -d           Repair dangerously.
  -V           Reports version and exits.
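One possible reading of the usage text above, offered as an assumption rather than something confirmed in this thread: the usage line shows that xfs_repair takes a single device argument, so if the webUI options box supplies the device itself, typing -v /dev/md3 would hand the tool two devices and it would print its usage instead of running. The Python sketch below models only that idea. `build_command` and `positional_args` are hypothetical helpers and do not reflect how the webUI is actually implemented.

```python
import shlex


def build_command(options_box, device):
    """Assumed behaviour: the device is appended after whatever the options box holds."""
    return ["xfs_repair", *shlex.split(options_box), device]


def positional_args(argv):
    """Simplified: every token not starting with '-' counts as positional.

    This ignores options that take a value (such as -l logdev or -m maxmem).
    """
    return [arg for arg in argv[1:] if not arg.startswith("-")]


for box in ("-nv", "-v", "-v /dev/md3"):
    argv = build_command(box, "/dev/md3")
    count = len(positional_args(argv))
    verdict = "one device, would run" if count == 1 else "not exactly one device, usage text"
    print(" ".join(argv), "->", verdict)
```

Under that assumption, entering only -v in the options box would be the form to try, but per the advice earlier in the thread, ask before changing anything.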