bobobeastie (Author) Posted October 25, 2019 (edited)
Great, thank you very much. I'm going to play it safe and take your advice about using a new drive. I went to bestbuy.com and it let me know that a 10TB Easystore in my cart had gone down in price, so I'm going to take that as a sign: put that in my main server, replace an 8TB drive there, and use that 8TB drive here. I take it this fixes Disk9, and 8 will be rebuilt from parity? Also, is it likely that bad data controllers had anything to do with causing any of these issues?
JorgeB Posted October 26, 2019
9 hours ago, bobobeastie said: I take it this fixes Disk9, and 8 will be rebuilt from parity?
It will re-enable disk9 and start rebuilding disk8; success depends on whether parity is still valid.
9 hours ago, bobobeastie said: Also, is it likely that bad data controllers had anything to do with causing any of these issues?
Very possibly, especially the Marvell controller you were using; they are known to sometimes drop disks for no apparent reason.
bobobeastie (Author) Posted October 26, 2019
Good to have a probable explanation. The 4-port controller was a Marvell 88SE9215 and the 2-port one was an ASMedia 1061. So I think having an enterprise-level SAS card that I have used without issue for over a year will be good.
bobobeastie (Author) Posted October 29, 2019
Excellent, thank you. I put the different/"new" drive in as disk 8 and followed the instructions exactly. The "new" disk 8 is listed as "Unmountable: No file system", but thanks to your detailed instructions I did not stop the array or choose to format the drive. Once the rebuild is done, and I run a file system check, what will the outcome be? I assume what is really being checked is the emulated contents based on the other drives and parity. So if that emulated drive is fixed, am I then able to rebuild the drive's contents again, or does it magically become mountable both in emulation and on the physical drive, with everything going back to normal? Or is disk 8 lost, in which case it's good that I kept the old disk to the side, where I can try to use the 89% that did not get reformatted?
JorgeB Posted October 29, 2019
7 hours ago, bobobeastie said: Once the rebuild is done, and I run a file system check, what will the outcome be?
That depends on how bad the corruption is; it should be fixable in most cases.
bobobeastie (Author) Posted October 29, 2019 (edited)
It finished, I reloaded the page, stopped the array, then started in maintenance mode, and on the page for disk 8 it shows as "Unmountable: No file system" and fs type is set to auto.
blkid:
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/sda1: LABEL_FATBOOT="UNRAID" LABEL="UNRAID" UUID="2732-64F5" TYPE="vfat" PARTUUID="a3760dfe-01"
/dev/nvme0n1p1: UUID="f69130dd-7800-43c1-8fe6-1409cc4d3060" TYPE="crypto_LUKS"
/dev/sdb1: UUID="59de781d-b037-441a-b7eb-c918e2ed2d49" TYPE="crypto_LUKS" PARTUUID="ce24456f-79a7-425b-926b-908c829c8719"
/dev/sdc1: UUID="ffb9c825-16f5-49bb-9225-58b349c15524" TYPE="crypto_LUKS" PARTUUID="62e353df-95ac-41a2-98e6-aaec2b37913d"
/dev/sdd1: UUID="74cc9054-0ad6-4c5a-b17e-ffa174b8816a" TYPE="crypto_LUKS" PARTUUID="b6ae2bc8-aad2-489f-bb4f-b354371d9511"
/dev/sde1: UUID="36fb8f2a-7833-4dfc-8838-af76fb89a733" TYPE="crypto_LUKS" PARTUUID="2f100d44-a1b4-4e39-94ca-388105318d81"
/dev/sdf1: UUID="f6804684-df06-42ab-9afc-ec4277c848f2" TYPE="crypto_LUKS" PARTUUID="51acfce5-61fb-4788-961a-2b34b6115fa6"
/dev/sdh1: UUID="023c43b4-cff7-45b8-bc7c-df5e85630455" TYPE="crypto_LUKS" PARTUUID="a45371be-e202-4a81-9604-ffc1d7591bc5"
/dev/sdi1: UUID="a5241765-df2b-4966-ba7f-38fed7ae6d58" TYPE="crypto_LUKS" PARTUUID="a74c7aaa-ff16-4885-bad4-1aab9a3b39ce"
/dev/sdj1: UUID="afc0186b-5d48-4888-bdcc-99e3c17af950" TYPE="crypto_LUKS" PARTUUID="57111893-5e4f-428e-a927-6d96fdef8fd2"
/dev/sdk1: UUID="023c43b4-cff7-45b8-bc7c-df5e85630455" TYPE="crypto_LUKS" PARTUUID="e4acd90a-fcfd-4b45-a68a-1bc496acd051"
/dev/sdl1: UUID="d41db265-f644-4ad9-9c8e-78e38673af04" TYPE="crypto_LUKS" PARTUUID="31a34a62-a0f0-45d1-97f2-3d103dab2d76"
/dev/md1: UUID="023c43b4-cff7-45b8-bc7c-df5e85630455" TYPE="crypto_LUKS"
/dev/md2: UUID="d41db265-f644-4ad9-9c8e-78e38673af04" TYPE="crypto_LUKS"
/dev/md3: UUID="a5241765-df2b-4966-ba7f-38fed7ae6d58" TYPE="crypto_LUKS"
/dev/md4: UUID="59de781d-b037-441a-b7eb-c918e2ed2d49" TYPE="crypto_LUKS"
/dev/md5: UUID="ffb9c825-16f5-49bb-9225-58b349c15524" TYPE="crypto_LUKS"
/dev/md6: UUID="36fb8f2a-7833-4dfc-8838-af76fb89a733" TYPE="crypto_LUKS"
/dev/md7: UUID="f6804684-df06-42ab-9afc-ec4277c848f2" TYPE="crypto_LUKS"
/dev/md8: UUID="afc0186b-5d48-4888-bdcc-99e3c17af950" TYPE="crypto_LUKS"
/dev/md9: UUID="023c43b4-cff7-45b8-bc7c-df5e85630455" TYPE="crypto_LUKS"
/dev/mapper/md1: UUID="d1c0645c-cf5b-4589-bd2f-6dccc0f99467" TYPE="xfs"
/dev/mapper/md2: UUID="b83db605-8817-4174-9db9-b7e43e533179" TYPE="xfs"
/dev/mapper/md3: UUID="db2b3d1c-513a-4b32-bb55-5ca4df663303" TYPE="xfs"
/dev/mapper/md4: UUID="f17d514e-699f-4939-b22e-83ee770c67d7" TYPE="xfs"
/dev/mapper/md5: UUID="0a7c834d-88fc-4318-85c5-a69a7449f1dc" TYPE="xfs"
/dev/mapper/md6: UUID="3aea003c-7173-4efb-bfec-a775d9ebe4cf" TYPE="xfs"
/dev/mapper/md7: UUID="af81136a-8131-4341-b705-f6c50638961f" TYPE="xfs"
/dev/mapper/md8: UUID="196ad532-7693-46cf-ad40-13bdccc057cf" TYPE="xfs"
/dev/mapper/md9: UUID="291b9458-9fa2-4d95-a68f-2c31eecf5d57" TYPE="xfs"
/dev/mapper/nvme0n1p1: UUID="0ee7ecd1-bff0-43c7-b1e7-def11ff953c3" UUID_SUB="229084e1-41a2-4fbc-ab3e-e7a73d2c48d4" TYPE="btrfs"
/dev/nvme0n1: PTTYPE="dos"
/dev/sdg1: UUID="b88n:m?f-7ldi-4;>5-c:nl-o=6j?n<ccec0" TYPE="crypto_LUKS" PARTUUID="14cef639-350a-4daf-bfc0-ee5239c0ec62"
sdk1 and sdh1 have the same UUID. What should my next step be?
edit: The drive is not showing up in the mapper part, so I'm guessing that means I can't run the command xfs_admin -U generate /dev/mapper/mdX, because there is no corresponding md value.
edit2: Disk Log has this error:
Oct 29 14:40:03 Tower kernel: print_req_error: I/O error, dev sdj, sector 15628052928
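Spotting a repeated UUID by eye in a long blkid listing is error-prone, so here is a minimal sketch that does the comparison mechanically. It assumes the usual blkid layout of one device per line with space-separated KEY="value" fields; `find_dup_uuids` is a hypothetical helper name, not part of Unraid or util-linux.

```shell
#!/bin/bash
# find_dup_uuids (hypothetical helper): read blkid-style lines on stdin and
# print every UUID that occurs on more than one device, followed by the
# devices that carry it.
find_dup_uuids() {
    awk '
        {
            dev = $1
            sub(/:$/, "", dev)              # "/dev/sdh1:" -> "/dev/sdh1"
            for (i = 2; i <= NF; i++) {
                # Only the UUID= field counts; PARTUUID= and UUID_SUB= are skipped
                if ($i ~ /^UUID="/) {
                    uuid = $i
                    gsub(/^UUID="|"$/, "", uuid)
                    count[uuid]++
                    devs[uuid] = devs[uuid] " " dev
                }
            }
        }
        END {
            for (u in count)
                if (count[u] > 1)
                    print u ":" devs[u]
        }
    '
}

# Example use on a live system, restricted to physical partitions:
#   blkid | grep '^/dev/sd' | find_dup_uuids
```

Filtering to `/dev/sd*` first is a deliberate choice: in the listing above each `/dev/mdN` entry repeats the LUKS UUID of the partition behind it, so comparing every line would flag those pairs as well.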
JorgeB Posted October 30, 2019
Please post the diagnostics: Tools -> Diagnostics
bobobeastie (Author) Posted October 30, 2019
Diagnostics: tower-diagnostics-20191030-0902.zip
JorgeB Posted October 30, 2019
Diags after starting the array, please.
bobobeastie (Author) Posted October 30, 2019
Ok, I thought maintenance mode would do it: tower-diagnostics-20191030-0956.zip
JorgeB Posted October 30, 2019
The problem isn't the duplicate UUID. The duplicate UUIDs are only on the LUKS devices, and although that is strange, it might be normal for LUKS; I don't use encryption, so I'm not sure. In any case, the XFS filesystems have different UUIDs:
/dev/mapper/md1: UUID="d1c0645c-cf5b-4589-bd2f-6dccc0f99467" TYPE="xfs"
/dev/mapper/md8: UUID="196ad532-7693-46cf-ad40-13bdccc057cf" TYPE="xfs"
The problem is just standard filesystem corruption:
Oct 30 02:55:55 Tower kernel: XFS (dm-7): Metadata CRC error detected at xfs_sb_read_verify+0x111/0x15f [xfs], xfs_sb_quiet block 0xffffffffffffffff
Oct 30 02:55:55 Tower kernel: XFS (dm-7): Unmount and run xfs_repair
Run xfs_repair on disk8.
bobobeastie (Author) Posted October 30, 2019 (edited)
Sorry, I may have forgotten to mention that xfs_repair is not listed on the page for disk 8, even in maintenance mode, while it is for the other disks. Can I run it from the terminal? If so, with what command?
edit: Not available unmounted either.
JorgeB Posted October 30, 2019
xfs_repair -v /dev/mapper/md8
bobobeastie (Author) Posted October 30, 2019
Should that be in maintenance mode? Unmounted:
root@Tower:~# xfs_repair -v /dev/mapper/md8
/dev/mapper/md8: No such file or directory
/dev/mapper/md8: No such file or directory
fatal error -- couldn't initialize XFS library
bobobeastie (Author) Posted October 30, 2019
Looks like yes. In maintenance mode:
root@Tower:~# xfs_repair -v /dev/mapper/md8
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
        - block cache size set to 722176 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 18 tail block 18
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x439356, xfs_agf block 0x15d50b4c1/0x200
Metadata CRC error detected at 0x439356, xfs_agf block 0x246312d41/0x200
Metadata CRC error detected at 0x463086, xfs_agi block 0x246312d42/0x200
Metadata CRC error detected at 0x463086, xfs_agi block 0x15d50b4c2/0x200
bad uuid 196ad532-7693-46cf-5887-4b8d0df5f997 for agi 6
reset bad agi for ag 6
bad uuid 196ad532-7693-46cf-38ff-0f39a7ed4fe1 for agi 10
reset bad agi for ag 10
Metadata CRC error detected at 0x438f94, xfs_agfl block 0x246312d43/0x200
agfl has bad CRC for ag 10
bad agbno 1156485499 in agfl, agno 10
bad agbno 1703632283 in agfl, agno 10
bad agbno 4274230279 in agfl, agno 10
bad agbno 3276528919 in agfl, agno 10
bad agbno 1321478119 for btbno root, agno 10
bad agbno 1891151798 for btbcnt root, agno 10
agf_freeblks 60595936, counted 0 in ag 10
agf_longest 13360461, counted 0 in ag 10
bad agbno 3307702038 for finobt root, agno 6
bad agbno 3783116261 for finobt root, agno 10
agi_freecount 363, counted 0 in ag 10 finobt
sb_icount 18560, counted 2816
sb_ifree 221, counted 465
sb_fdblocks 4653142, counted 1265621349
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 10
        - agno = 4
        - agno = 6
        - agno = 5
        - agno = 7
        - agno = 8
        - agno = 1
        - agno = 9
        - agno = 11
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) were copied from a backup superblock.
Please reset with mount -o sunit=<value>,swidth=<value> if necessary

        XFS_REPAIR Summary    Wed Oct 30 03:32:19 2019

Phase           Start           End             Duration
Phase 1:        10/30 03:32:18  10/30 03:32:18
Phase 2:        10/30 03:32:18  10/30 03:32:18
Phase 3:        10/30 03:32:18  10/30 03:32:19  1 second
Phase 4:        10/30 03:32:19  10/30 03:32:19
Phase 5:        10/30 03:32:19  10/30 03:32:19
Phase 6:        10/30 03:32:19  10/30 03:32:19
Phase 7:        10/30 03:32:19  10/30 03:32:19

Total run time: 1 second
done
After that I can start the array!! Thank you very much, time to try to recover some things.
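The two attempts above differ only in array state: with the array stopped, /dev/mapper/md8 did not exist and xfs_repair failed immediately, while in maintenance mode the device was present. A minimal sketch of a guard for that, assuming (as these outputs suggest) that the mapper device only appears once the array has been started; `check_xfs` is a hypothetical helper name, and `-n` is xfs_repair's no-modify option, used here so the first pass only reports problems.

```shell
#!/bin/bash
# check_xfs DEVICE (hypothetical helper): run a read-only xfs_repair pass,
# but only if DEVICE is actually present as a block device.
check_xfs() {
    local dev="$1"
    if [ ! -b "$dev" ]; then
        echo "$dev not found: start the array in maintenance mode first" >&2
        return 1
    fi
    # -n = no-modify mode: report what would be fixed without writing anything
    xfs_repair -n "$dev"
}

# Example: check_xfs /dev/mapper/md8
```

If the read-only report looks reasonable, the actual repair is the command given in the thread, `xfs_repair -v /dev/mapper/md8`.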
bobobeastie (Author) Posted October 30, 2019
Everything is good. I found some files on a disk that had been replaced and added them back to the array. I'm now trying to mount the old disk 8, which had been part of the array but which I kept aside while fixing things. It won't mount in Unassigned Devices, and I noticed this in the log:
Oct 30 15:31:17 Tower unassigned.devices: Adding disk '/dev/mapper/HGST_HDN726060ALE614_K1H90MAD'...
Oct 30 15:31:17 Tower unassigned.devices: luksOpen error: Device HGST_HDN726060ALE614_K1H90MAD already exists.
Oct 30 15:31:17 Tower unassigned.devices: Partition 'HGST_HDN726060ALE614_K1H90MAD' could not be mounted...
Can anything be done to get it mounted?
tower-diagnostics-20191030-2349.zip
JorgeB Posted October 31, 2019
7 hours ago, bobobeastie said: luksOpen error: Device HGST_HDN726060ALE614_K1H90MAD already exists.
It's not mounting because of this; it should mount with the array stopped.
bobobeastie (Author) Posted November 3, 2019
It won't mount when the server is freshly booted with the array stopped, though there's probably no mechanism to read an entered key at that point. It won't mount after the array has been started and stopped either.
Nov 3 03:00:05 Tower unassigned.devices: Adding disk '/dev/mapper/HGST_HDN726060ALE614_K1H90MAD'...
Nov 3 03:00:05 Tower unassigned.devices: luksOpen error: Device HGST_HDN726060ALE614_K1H90MAD already exists.
Nov 3 03:00:05 Tower unassigned.devices: Partition 'HGST_HDN726060ALE614_K1H90MAD' could not be mounted...
I'm ready to give up on the drive if there's nothing I can do; I just wanted to check one last time. I'm hoping that after a preclear or two I can add it as a new disk.
tower-diagnostics-20191103-1101.zip
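The luksOpen message "Device ... already exists" generally indicates that a device-mapper node with that name is already present under /dev/mapper. A minimal, read-only sketch for checking that before asking in the UD support thread; `mapping_exists` is a hypothetical helper, and whether a leftover mapping is really the cause on this server is an assumption the thread does not confirm.

```shell
#!/bin/bash
# mapping_exists NAME [DIR] (hypothetical helper): succeed if a node called
# NAME exists in DIR. DIR defaults to /dev/mapper and is a parameter only so
# the check can be tried against a scratch directory.
mapping_exists() {
    local name="$1" dir="${2:-/dev/mapper}"
    [ -e "$dir/$name" ]
}

# Example: show details only if the mapping is there (nothing is modified):
#   mapping_exists HGST_HDN726060ALE614_K1H90MAD &&
#       cryptsetup status HGST_HDN726060ALE614_K1H90MAD
```

The sketch only inspects state; it does not close or remove the mapping, since doing that on a disk you still hope to recover is better discussed with the plugin's maintainers first.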
JorgeB Posted November 3, 2019
I don't use encryption; you can ask for help in the UD support thread.
bobobeastie (Author) Posted November 7, 2019 (edited)
Thank you very much @johnnie.black, a gentleman and a scholar👍, I couldn't have done it without your help.
edit: Not sure how these get marked solved; it looks like maybe a mod needs to do it. If one sees this, please feel free to mark this solved.
itimpi Posted November 7, 2019
7 minutes ago, bobobeastie said: Not sure how these get marked solved
You can edit the title in the first post of the thread to add (SOLVED) to the title.