geekdomo (Author) Posted January 9, 2022
So it was rebuilding and then it froze. I had to hard reset, and when I did, this is what came up. Whenever I reboot, a different drive shows up with a red dot; it keeps moving around. I cannot rebuild the array now because when I start it, it throws 100% errors. It was usable; I was starting to transfer my pictures off to my local drive, and that's when it froze.
JorgeB Posted January 9, 2022
There are errors on 8 disks, which suggests a controller issue, assuming they share one. Diags, if you haven't rebooted yet, might give more info.
geekdomo (Author) Posted January 9, 2022
Holy crap. I connected to it and everything is gone. I have no shares and the drives are empty. Is there any way to fix this? (Attached: Archive.zip)
geekdomo (Author) Posted January 9, 2022
OK, it's back up, but the drive says it's unmountable. I did get my files back; I assume that drive is being replicated through parity. Trying to get my most important files off before it goes down again. Any tips would be appreciated at this point. I realized the reason it was blank was because I had started it in maintenance mode. I'm scrambling right now trying to save my work, so forgive me if I make a mistake.
Edited January 9, 2022 by geekdomo
JorgeB Posted January 9, 2022
The diags are from after rebooting, so we can't see what happened, but most likely, as mentioned, it was a controller issue. Try the controller in a different PCIe slot if available, and if there are more errors, grab the diags before rebooting.
geekdomo (Author) Posted January 9, 2022
I am going to hold off doing anything physical for a little bit. It's back up and supposedly repairing the array with parity. I think it's moving the parity bits to the working drives as they are being written to. I am going to try to get my most important stuff off ASAP. I have ~30TB on my desk and am going to move as much there as I can. I have new parts (including a new controller) coming next week, so I may just limp along grabbing what I can. Not ideal, but it's mildly doable. Once I migrate the system to the new case and new controller, I will write back with how it's behaving.
trurl Posted January 9, 2022
The most recent screenshot showed disk8 mounted and with about 1TB of data, so that part looks like it may have worked. Your most recent diagnostics showed disk3 missing and disabled. The array wasn't started, though, so I can't tell whether any disks are unmountable.

2 hours ago, geekdomo said: "I think its moving the parity bits to the working drives as they are being written to."

Not even close to how things actually work. https://wiki.unraid.net/Manual/Overview#Parity-Protected_Array

Since you only have single parity, it can only rebuild one drive. That one drive would have a lot of writes and the others not so much, but those others would all have a lot of reads, since all the other disks are read to get the result of the parity calculation that is written to the rebuilding disk.
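To make the rebuild behaviour described above concrete, here is a minimal Python sketch of single parity treated as a byte-wise XOR across disks. The disk contents and the helper names (`xor_blocks`, `rebuild`) are made up for illustration; this is not Unraid's actual code, only a toy model of the idea that one missing disk is recomputed by reading parity plus every other disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" of four bytes each (made-up contents).
data_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# Single parity: the XOR of every data disk at the same offset.
parity = xor_blocks(data_disks)


def rebuild(missing_index, disks, parity):
    """Recompute one missing disk by reading parity and all surviving disks."""
    survivors = [d for i, d in enumerate(disks) if i != missing_index]
    return xor_blocks(survivors + [parity])


# Every surviving disk is read; only the rebuilt disk would be written.
rebuilt = rebuild(1, data_disks, parity)
print(rebuilt == data_disks[1])
```

With only one parity block per offset there is only one unknown that can be solved for, which is why a second missing disk cannot be recovered in this model.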
geekdomo (Author) Posted January 9, 2022
I am testing a backup motherboard, as I am running out of ideas about what it might be. It's very random. Boots are insanely slow. I have several old mobos around. Testing a new one now.
geekdomo (Author) Posted January 9, 2022
Installed new motherboard. Now 4 drives are missing. I am beginning to lean toward it being the power supply or the power cable. The 3 drives are all on the same daisy-chained SATA power cable. The reason I am leaning more towards that is that 2 of the missing drives are plugged into the motherboard and not the controller (as I was out of room on the SAS controller). My new case has 16 slots on a single backplane (power and SAS). I may wait until it gets here (Tuesday) before testing any further, so as to preserve the drives and my sanity.
trurl Posted January 9, 2022
Some things to consider going forward, not about your hardware issues, but about how to proceed with your disks when you get the hardware issues fixed.

Maybe you already know this, but if any disk that should have data on it shows as Unmountable, DON'T FORMAT. You must repair the filesystem instead. Parity can't recover a formatted disk because after formatting a disk in the parity array, parity agrees that the disk has been formatted. This applies not only to a disk that isn't disabled, but also to a disabled/emulated disk.

Since there have been so many problems with so many disks, it is probably better if you don't attempt to rebuild on top of any existing disk that might have data on it. Instead, rebuild to a disk that isn't already in the array, and save that original disk in case it is needed to try to recover files if there are problems with the rebuild.

If you don't have spares and don't want to buy any, before rebuilding on top of an existing disk, at least confirm that the emulated disk doesn't say it is Unmountable, and that the emulated files all seem to be OK. The emulated disk contents are exactly what the result of a successful rebuild will be.
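The point that parity "agrees" with a format can be shown with a small Python sketch. It assumes parity is a byte-wise XOR that is updated on every write by removing the old data's contribution and adding the new one; the names `write_block` and `emulate` and the four-byte "disks" are hypothetical and only illustrate the principle, not Unraid's implementation.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def write_block(disks, parity, index, new_data):
    """Write new_data to disks[index] and return the updated parity."""
    old_data = disks[index]
    # Drop the old contribution from parity, then add the new one.
    new_parity = xor_bytes(xor_bytes(parity, old_data), new_data)
    disks[index] = new_data
    return new_parity


def emulate(disks, parity, index):
    """Compute what disk `index` holds, using only parity and the other disks."""
    result = parity
    for i, disk in enumerate(disks):
        if i != index:
            result = xor_bytes(result, disk)
    return result


disks = [b"FILE", b"DATA", b"MORE"]  # made-up contents
parity = xor_bytes(xor_bytes(disks[0], disks[1]), disks[2])

before = emulate(disks, parity, 1)  # emulated view still holds the data

# A "format" is just another write as far as parity is concerned.
parity = write_block(disks, parity, 1, b"\x00\x00\x00\x00")
after = emulate(disks, parity, 1)  # emulated view is now the formatted content

print(before, after)
```

In this model the emulated disk after the format is the freshly formatted content, so a rebuild from parity would only reproduce the empty filesystem.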
geekdomo (Author) Posted January 9, 2022
I think I understand. I have not formatted any disks; I am just trying to recover them via parity. When the array is up (I turned the whole system off for now while waiting for my parts) and there are no missing drives (that started after replacing the motherboard), I can copy the files off the array. That is encouraging because I can save most of my work. I really do feel it's a power cable. I have had this system running for around 6 years now, and it has had many different motherboards/cases/drives. Because the current iteration is a 4U case (way too big) that only has cages and no hot-swap backplane, I am running SAS split out to individual connectors and a daisy-chained SATA power cable.

I think it's that power cable. I do not have a spare of that, but my new case is arriving Tuesday. The new case has a fully powered backplane and has 2 standard SAS ports that I can plug into my card. I have also ordered a new SAS card to have as a backup. With the new case I can isolate whether the issue is cables and power, which will help a ton right now. Thanks again for your help. Hoping that once I rebuild it will be functional and I can mark it as solved.
Edited January 9, 2022 by geekdomo
geekdomo (Author) Posted January 13, 2022
OK, I am back. Well, the server is, sorta. I hope it's fixable. I have a brand new case, one with a proper backplane, SAS controller, and power distribution. The server is running; however, there are 2 drives that say unmountable. Where we last left off, the drives were "failing" and marked red because (I believe) the daisy-chained power cable was bad. Now that I have proper power distribution, all of the drives appear online; however, 2 drives appear unmountable, with 1 being repaired by a parity rebuild. What should I do at this point? Should I let it go and repair it? Then what? Thanks (I will post diags in my next post).
geekdomo (Author) Posted January 13, 2022
Here are my diagnostics. tank-diagnostics-20220113-1235.zip
geekdomo (Author) Posted January 13, 2022
OK, this is the result for Disk 3 running the check with -n:

Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
Phase 2 - using internal log
        - zero log...
        - 08:48:32: zeroing log - 7632 of 7632 blocks done
        - scan filesystem freespace and inode maps...
bad magic number
Metadata CRC error detected at 0x43a516, xfs_agf block 0x4a886901/0x200
agf has bad CRC for ag 40
Metadata CRC error detected at 0x465ab6, xfs_agi block 0x4a886902/0x200
agi has bad CRC for ag 40
bad on-disk superblock 40 - bad magic number
primary/secondary superblock 40 conflict - AG superblock geometry info conflicts with filesystem geometry
would zero unused portion of secondary superblock (AG #40)
bad magic # 0xaf275efa for agf 40
bad version # 1328099507 for agf 40
bad sequence # 189338519 for agf 40
bad length -1980524498 for agf 40, should be 3907668
flfirst -1486662249 in agf 40 too large (max = 118)
fllast -1347596919 in agf 40 too large (max = 118)
bad uuid e2791e17-7505-a00f-d9ce-026c1e2b579c for agf 40
bad magic # 0xb147897c for agi 40
bad version # 482171254 for agi 40
bad sequence # 2013798562 for agi 40
bad length # -328340048 for agi 40, should be 3907668
bad uuid 0d210c10-af38-f0a3-3bd4-f995b5668666 for agi 40
would reset bad sb for ag 40
would reset bad agf for ag 40
would reset bad agi for ag 40
bad uncorrected agheader 40, skipping ag...
sb_icount 92928, counted 92672
sb_ifree 18772, counted 18606
sb_fdblocks 526796607, counted 522891235
        - 08:48:36: scanning filesystem freespace - 249 of 250 allocation groups done
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - 08:48:36: scanning agi unlinked lists - 250 of 250 allocation groups done
        - process known inodes and perform inode discovery...
- agno = 15 - agno = 30 - agno = 90 - agno = 75 - agno = 45 - agno = 0 - agno = 105 - agno = 60 - agno = 120 - agno = 135 - agno = 165 - agno = 180 - agno = 195 - agno = 150 - agno = 225 - agno = 240 - agno = 210 - agno = 91 - agno = 121 - agno = 61 - agno = 151 - agno = 46 - agno = 211 - agno = 76 - agno = 92 - agno = 16 - agno = 196 - agno = 166 - agno = 106 - agno = 181 - agno = 226 - agno = 241 - agno = 197 - agno = 212 - agno = 152 - agno = 122 - agno = 182 - agno = 167 - agno = 198 - agno = 123 - agno = 227 - agno = 213 - agno = 228 - agno = 199 - agno = 153 - agno = 62 - agno = 136 - agno = 93 - agno = 107 - agno = 124 - agno = 94 - agno = 154 - agno = 168 - agno = 31 - agno = 183 - agno = 108 - agno = 200 - agno = 125 - agno = 47 - agno = 95 - agno = 155 - agno = 63 - agno = 77 - agno = 32 - agno = 109 - agno = 137 - agno = 78 - agno = 110 - agno = 201 - agno = 156 - agno = 48 - agno = 17 - agno = 1 - agno = 111 - agno = 126 - agno = 79 - agno = 138 - agno = 64 - agno = 33 - agno = 202 - agno = 214 - agno = 169 - agno = 96 - agno = 127 - agno = 242 - agno = 229 - agno = 157 - agno = 97 - agno = 80 - agno = 203 - agno = 49 - agno = 215 - agno = 98 - agno = 34 - agno = 230 - agno = 65 - agno = 184 - agno = 139 - agno = 158 - agno = 112 - agno = 128 - agno = 129 - agno = 170 - agno = 99 - agno = 113 - agno = 114 - agno = 100 - agno = 159 - agno = 216 - agno = 115 - agno = 140 - agno = 204 - agno = 50 - agno = 35 - agno = 81 - agno = 18 - agno = 51 - agno = 36 - agno = 130 - agno = 160 - agno = 66 - agno = 116 - agno = 117 - agno = 101 - agno = 82 - agno = 118 - agno = 243 - agno = 102 - agno = 185 - agno = 19 - agno = 2 - agno = 103 - agno = 67 - agno = 231 - agno = 161 - agno = 37 - agno = 52 - agno = 38 - agno = 83 - agno = 104 - agno = 53 - agno = 68 - agno = 39 - agno = 141 - agno = 119 - agno = 205 - agno = 69 - agno = 162 - agno = 20 - agno = 84 - agno = 54 - agno = 55 - agno = 186 - agno = 217 - agno = 85 - agno = 131 - agno = 171 - agno = 70 - agno = 
244 - agno = 232 - agno = 86 - agno = 163 - agno = 206 - agno = 142 - agno = 132 - agno = 187 - agno = 133 - agno = 164 - agno = 21 - agno = 3 - agno = 245 - agno = 218 - agno = 56 - agno = 188 - agno = 143 - agno = 172 - agno = 134 - agno = 87 - agno = 233 - agno = 144 - agno = 71 - agno = 189 - agno = 4 - agno = 219 - agno = 40 - agno = 41 - agno = 207 - agno = 57 - agno = 72 - agno = 88 - agno = 220 - agno = 234 - agno = 22 - agno = 208 - agno = 190 - agno = 145 - agno = 42 - agno = 58 - agno = 89 - agno = 173 - agno = 146 - agno = 191 - agno = 174 - agno = 209 - agno = 235 - agno = 221 - agno = 236 - agno = 147 - agno = 246 - agno = 73 - agno = 5 - agno = 43 - agno = 59 - agno = 23 - agno = 6 - agno = 74 - agno = 44 - agno = 24 - agno = 175 - agno = 25 - agno = 148 - agno = 7 - agno = 192 - agno = 237 - agno = 149 - agno = 26 - agno = 8 - agno = 27 - agno = 176 - agno = 28 - agno = 9 - agno = 193 - agno = 29 - agno = 177 - agno = 194 - agno = 178 - agno = 238 - agno = 247 - agno = 248 - agno = 239 - agno = 249 - agno = 222 - agno = 223 - agno = 179 - agno = 10 - agno = 224 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - 08:48:47: process known inodes and inode discovery - 92672 of 92928 inodes done - process newly discovered inodes... - 08:48:47: process newly discovered inodes - 500 of 250 allocation groups done Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - 08:48:47: setting up duplicate extent list - 250 of 250 allocation groups done - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 7 - agno = 8 - agno = 10 - agno = 3 - agno = 13 - agno = 6 entry "Nicole RL" at block 0 offset 680 in directory inode 335544416 references non-existent inode 1342177378 would clear inode number in entry at offset 680... 
- agno = 5 - agno = 15 - agno = 16 - agno = 17 - agno = 11 - agno = 18 - agno = 19 - agno = 20 - agno = 21 - agno = 12 - agno = 22 - agno = 23 - agno = 24 - agno = 25 - agno = 9 - agno = 26 - agno = 27 - agno = 28 entry "layer" in shortform directory 734378473 references non-existent inode 1342177395 would have junked entry "layer" in directory inode 734378473 entry "de" in shortform directory 905969790 references non-existent inode 1342177376 would have junked entry "de" in directory inode 905969790 - agno = 30 - agno = 31 - agno = 32 - agno = 33 entry "75628" in shortform directory 943323502 references non-existent inode 1358959046 would have junked entry "75628" in directory inode 943323502 entry "35L" in shortform directory 943323544 references non-existent inode 1358955675 would have junked entry "35L" in directory inode 943323544 - agno = 34 entry "bucket 2" at block 0 offset 216 in directory inode 319419305 references non-existent inode 1358959060 would clear inode number in entry at offset 216... 
- agno = 35 - agno = 36 entry "ViewPublic" in shortform directory 1068227613 references non-existent inode 1358959043 would have junked entry "ViewPublic" in directory inode 1068227613 - agno = 37 - agno = 38 entry "vehicle" in shortform directory 1275068519 references non-existent inode 1358959042 would have junked entry "vehicle" in directory inode 1275068519 entry "Win2K" in shortform directory 1023412938 references non-existent inode 1342177466 would have junked entry "Win2K" in directory inode 1023412938 - agno = 40 - agno = 41 - agno = 42 - agno = 43 - agno = 44 - agno = 45 - agno = 46 - agno = 47 - agno = 48 - agno = 49 - agno = 50 - agno = 51 - agno = 52 - agno = 53 - agno = 54 - agno = 55 - agno = 57 - agno = 58 - agno = 59 - agno = 4 - agno = 60 - agno = 61 - agno = 62 - agno = 63 - agno = 64 - agno = 65 - agno = 66 - agno = 67 - agno = 68 - agno = 14 - agno = 70 - agno = 56 - agno = 72 - agno = 73 - agno = 74 - agno = 75 - agno = 39 - agno = 76 - agno = 77 entry "PNGS" in shortform directory 1308623001 references non-existent inode 1342177380 would have junked entry "PNGS" in directory inode 1308623001 - agno = 78 - agno = 71 - agno = 79 - agno = 80 - agno = 81 - agno = 82 - agno = 83 - agno = 84 - agno = 85 - agno = 86 - agno = 87 - agno = 88 - agno = 89 - agno = 90 - agno = 91 - agno = 92 - agno = 93 - agno = 94 - agno = 29 - agno = 96 - agno = 97 - agno = 98 - agno = 99 - agno = 100 - agno = 101 - agno = 102 - agno = 103 - agno = 104 - agno = 105 - agno = 106 - agno = 107 - agno = 108 - agno = 109 - agno = 110 - agno = 111 - agno = 112 - agno = 114 - agno = 115 - agno = 116 - agno = 95 - agno = 117 - agno = 118 - agno = 119 - agno = 120 - agno = 121 - agno = 122 - agno = 123 - agno = 124 - agno = 125 - agno = 126 - agno = 127 - agno = 128 - agno = 129 - agno = 130 - agno = 131 - agno = 132 - agno = 133 - agno = 134 - agno = 135 - agno = 136 - agno = 137 - agno = 138 - agno = 139 - agno = 140 - agno = 141 - agno = 142 - agno = 143 - agno = 69 - agno = 
145 - agno = 146 - agno = 147 - agno = 148 - agno = 150 - agno = 149 - agno = 151 - agno = 152 - agno = 153 - agno = 154 - agno = 155 - agno = 156 - agno = 157 - agno = 158 - agno = 159 - agno = 160 - agno = 161 - agno = 162 - agno = 163 - agno = 164 - agno = 165 - agno = 166 - agno = 167 entry "7" in shortform directory 4395632429 references non-existent inode 1358959049 would have junked entry "7" in directory inode 4395632429 - agno = 168 - agno = 169 - agno = 170 - agno = 144 entry "38701" at block 0 offset 3000 in directory inode 5624557818 references non-existent inode 1358959047 would clear inode number in entry at offset 3000... - agno = 171 - agno = 172 - agno = 173 - agno = 174 - agno = 175 - agno = 176 - agno = 177 - agno = 178 - agno = 179 - agno = 180 - agno = 181 entry "mt0014" at block 0 offset 408 in directory inode 6013583046 references non-existent inode 1342177388 would clear inode number in entry at offset 408... - agno = 182 entry "page0002" in shortform directory 5949564318 references non-existent inode 1358959057 would have junked entry "page0002" in directory inode 5949564318 - agno = 183 - agno = 184 - agno = 185 - agno = 186 - agno = 187 - agno = 188 - agno = 190 - agno = 191 - agno = 192 - agno = 193 - agno = 194 - agno = 195 - agno = 196 entry "crack" in shortform directory 6394219821 references non-existent inode 1342177470 would have junked entry "crack" in directory inode 6394219821 - agno = 197 - agno = 199 - agno = 200 - agno = 201 - agno = 202 - agno = 203 - agno = 204 - agno = 205 - agno = 206 entry "passage1" in shortform directory 6681474117 references non-existent inode 1342177387 would have junked entry "passage1" in directory inode 6681474117 - agno = 207 - agno = 208 - agno = 209 - agno = 210 - agno = 211 - agno = 212 - agno = 213 - agno = 214 - agno = 215 - agno = 216 - agno = 217 - agno = 218 - agno = 219 entry "6" in shortform directory 6890576677 references non-existent inode 1358959054 would have junked entry "6" in 
directory inode 6890576677 - agno = 220 - agno = 221 - agno = 222 - agno = 223 - agno = 224 - agno = 198 - agno = 225 - agno = 226 - agno = 227 - agno = 189 - agno = 228 - agno = 229 - agno = 230 - agno = 231 - agno = 232 - agno = 233 - agno = 234 - agno = 235 - agno = 236 entry "Drivers" in shortform directory 7852131040 references non-existent inode 1358959041 would have junked entry "Drivers" in directory inode 7852131040 entry "2" in shortform directory 7886016084 references non-existent inode 1358959048 would have junked entry "2" in directory inode 7886016084 - agno = 237 - agno = 238 - agno = 239 - agno = 241 - agno = 242 - agno = 243 - agno = 244 - agno = 245 - agno = 246 - agno = 247 - agno = 248 - agno = 249 - agno = 240 - agno = 113 entry "(Footage)" at block 0 offset 96 in directory inode 7617116544 references non-existent inode 1358959040 would clear inode number in entry at offset 96... - 08:48:47: check for inodes claiming duplicate blocks - 92672 of 92928 inodes done No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem ... 
entry "Win2K" in shortform directory 1023412938 references non-existent inode 1342177466 would junk entry entry "ViewPublic" in shortform directory 1068227613 references non-existent inode 1358959043 would junk entry entry "38701" in directory inode 5624557818 points to non-existent inode 1358959047, would junk entry bad hash table for directory inode 5624557818 (no data entry): would rebuild would rebuild directory inode 5624557818 entry "passage1" in shortform directory 6681474117 references non-existent inode 1342177387 would junk entry entry "(Footage)" in directory inode 7617116544 points to non-existent inode 1358959040, would junk entry bad hash table for directory inode 7617116544 (no data entry): would rebuild would rebuild directory inode 7617116544 entry "6" in shortform directory 6890576677 references non-existent inode 1358959054 would junk entry entry "7" in shortform directory 4395632429 references non-existent inode 1358959049 would junk entry entry "vehicle" in shortform directory 1275068519 references non-existent inode 1358959042 would junk entry entry "crack" in shortform directory 6394219821 references non-existent inode 1342177470 would junk entry entry "layer" in shortform directory 734378473 references non-existent inode 1342177395 would junk entry entry "page0002" in shortform directory 5949564318 references non-existent inode 1358959057 would junk entry entry "PNGS" in shortform directory 1308623001 references non-existent inode 1342177380 would junk entry entry "Drivers" in shortform directory 7852131040 references non-existent inode 1358959041 would junk entry entry "bucket 2" in directory inode 319419305 points to non-existent inode 1358959060, would junk entry bad hash table for directory inode 319419305 (no data entry): would rebuild would rebuild directory inode 319419305 entry "Nicole RL" in directory inode 335544416 points to non-existent inode 1342177378, would junk entry bad hash table for directory inode 335544416 (no data 
entry): would rebuild would rebuild directory inode 335544416 entry "de" in shortform directory 905969790 references non-existent inode 1342177376 would junk entry entry "75628" in shortform directory 943323502 references non-existent inode 1358959046 would junk entry entry "35L" in shortform directory 943323544 references non-existent inode 1358955675 would junk entry entry "2" in shortform directory 7886016084 references non-existent inode 1358959048 would junk entry entry "mt0014" in directory inode 6013583046 points to non-existent inode 1342177388, would junk entry bad hash table for directory inode 6013583046 (no data entry): would rebuild would rebuild directory inode 6013583046 - traversal finished ... - moving disconnected inodes to lost+found ... disconnected dir inode 1375731818, would move to lost+found disconnected dir inode 1409286245, would move to lost+found disconnected dir inode 1409286258, would move to lost+found disconnected dir inode 1468307365, would move to lost+found disconnected dir inode 1479327555, would move to lost+found disconnected dir inode 1479327567, would move to lost+found disconnected dir inode 1512656963, would move to lost+found disconnected dir inode 1512809041, would move to lost+found disconnected dir inode 1543507111, would move to lost+found disconnected dir inode 1577058427, would move to lost+found disconnected dir inode 1711276317, would move to lost+found disconnected dir inode 1780914726, would move to lost+found disconnected dir inode 2148532684, would move to lost+found disconnected dir inode 3626478636, would move to lost+found disconnected dir inode 3659628132, would move to lost+found disconnected dir inode 3704243783, would move to lost+found disconnected dir inode 3727704286, would move to lost+found Phase 7 - verify link counts... 
would have reset inode 319419305 nlinks from 26 to 25
would have reset inode 1023412938 nlinks from 4 to 3
would have reset inode 905969790 nlinks from 23 to 22
would have reset inode 1275068519 nlinks from 3 to 2
would have reset inode 1308623001 nlinks from 4 to 3
would have reset inode 1068227613 nlinks from 11 to 10
would have reset inode 4395632429 nlinks from 9 to 8
would have reset inode 5624557818 nlinks from 271 to 270
would have reset inode 6013583046 nlinks from 26 to 25
would have reset inode 5949564318 nlinks from 13 to 12
would have reset inode 6394219821 nlinks from 4 to 3
would have reset inode 6890576677 nlinks from 14 to 13
would have reset inode 7617116544 nlinks from 4 to 3
would have reset inode 7852131040 nlinks from 7 to 6
would have reset inode 7886016084 nlinks from 4 to 3
would have reset inode 6681474117 nlinks from 8 to 7
would have reset inode 734378473 nlinks from 3 to 2
would have reset inode 943323502 nlinks from 14 to 13
would have reset inode 943323544 nlinks from 14 to 13
would have reset inode 335544416 nlinks from 40 to 39
        - 08:48:58: verify and correct link counts - 250 of 250 allocation groups done
No modify flag set, skipping filesystem flush and exiting.
geekdomo (Author) Posted January 13, 2022
Here is the check with -n for Drive 1 (also unmountable):

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
totally zeroed log
        - scan filesystem freespace and inode maps...
sb_fdblocks 609418691, counted 610885709
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 5
        - agno = 3
        - agno = 4
        - agno = 1
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (4:3113599) is ahead of log (0:0).
Would format log to cycle 7.
No modify flag set, skipping filesystem flush and exiting.
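Dry-run reports like the Disk 3 one above are long, so it can help to tally what a real repair would change before running it. The following Python sketch is one way to do that; the regular expressions and the `summarize` helper are hypothetical, chosen only from phrases that appear in the output posted in this thread, and they are not an official or complete list of xfs_repair messages. The sample text is a few lines copied from the Disk 3 report.

```python
import re
from collections import Counter

# Hypothetical patterns, based only on phrases seen in the report above.
PATTERNS = {
    "junk_entry": re.compile(r"would (?:have )?junk(?:ed)? entry"),
    "lost_found": re.compile(r"would move to lost\+found"),
    "nlinks_reset": re.compile(r"would have reset inode \d+ nlinks"),
    "bad_ag_header": re.compile(r"bad uncorrected agheader \d+"),
}


def summarize(report: str) -> Counter:
    """Count how often each pattern occurs in an `xfs_repair -n` report."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(pattern.findall(report))
    return counts


# A few lines copied from the Disk 3 report, used as sample input.
sample = """\
bad uncorrected agheader 40, skipping ag...
entry "Win2K" in shortform directory 1023412938 references non-existent inode 1342177466 would junk entry
disconnected dir inode 1375731818, would move to lost+found
disconnected dir inode 1409286245, would move to lost+found
would have reset inode 319419305 nlinks from 26 to 25
"""

summary = summarize(sample)
for name, count in summary.items():
    print(f"{name}: {count}")
```

A non-zero `lost_found` count is a hint that, after a real repair, some directories will need to be identified and moved back by hand.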
trurl Posted January 13, 2022
Did you run the checks from the webUI or the command line?
geekdomo (Author) Posted January 13, 2022
OK, I have run xfs_repair /dev/md1 and xfs_repair /dev/md3. On both drives it says it was successful and done, but both drives still say unmountable. What would be my next step?
geekdomo (Author) Posted January 13, 2022
6 minutes ago, trurl said: "Did you run the checks from the webUI or the command line?"

Both. GUI for the first stab at it and command line for the next 2.
itimpi Posted January 13, 2022
Have you tried restarting the array in normal mode?
geekdomo (Author) Posted January 13, 2022
Just now, itimpi said: "Have you tried restarting the array in normal mode?"

Yes, I have. It immediately started a parity rebuild; however, it was acting strange (to me; I have never been in this situation with Unraid). It looked like it was reading from all of the drives and writing to Disk 3. It was somehow reading from disk 1 even though it was also unmountable. I am assuming it was virtual.
geekdomo (Author) Posted January 13, 2022
I just ran the repair with -L and it completed, but both are unmountable. Should I restart the array?
itimpi Posted January 13, 2022
Just now, geekdomo said: "Yes I have. It immediately started parity rebuild however was acting strange (to me I have never been in this situation with Unraid). It looked like it was reading from all of the drives and writing to Disk 3. It was somehow reading from disk 1 although it was also unmountable. I am assuming it was virtual."

This is what I would expect. Just because a drive is unmountable does not stop its contents being included in the parity calculation and thus read for rebuild purposes.

It is a bit concerning that either drive is still being flagged as unmountable. If a disk is showing as unmountable before starting the rebuild, then it will still be unmountable afterwards, as all the rebuild does is make the physical disk match the emulated one. That is why we normally recommend fixing the unmountable state on the emulated drive before attempting the rebuild.
geekdomo (Author) Posted January 13, 2022
I restarted the array and.... WAHOO!! It's working! Happily repairing Disk 3 (which, by the way, was never a problem all along but somehow got corrupted with all of the power faults). Thank YOU ALL, especially @trurl, for sticking with helping me out. I have to say I spent a few nights feeling sick to my stomach. I purchased a Dropbox business account with unlimited data and will be backing up to that plus my 27TB desktop array, along with getting another parity drive.
geekdomo (Author) Posted January 13, 2022 (Solution)
Let me summarize what the issue was for anyone who might fall down the same rabbit hole in months/years to come.

I had an older case without a proper backplane. After a simple install of a 10Gb NIC, upon reboot I noticed that I had a couple of drives with a red * on them. They were my oldest drives, so I assumed (incorrectly) that they had somehow blown up during the install. I then tried to troubleshoot it through Unraid, whereas it was always a power issue. Because my case didn't have a proper backplane and power distribution, I was using a daisy-chained SATA power cable that split off from a standard SATA power connector. Somehow during the NIC install I must have broken one of the wires or traces. Not sure, but that simple break caused me to chase issues that were not related to the OS.

Solution: I broke down and purchased a proper 12-slot 2U rack chassis and reinstalled all of the drives. Initially upon boot, 2 of the drives that had power issues during the troubleshooting were unmountable. I used the commands xfs_repair /dev/md1 -L and xfs_repair /dev/md3 -L. This repaired the file system enough so that the parity could be written. Thanks to trurl and itimpi (although they came in later, it was their post on fixing unmountable drives that I referred to when fixing these).
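For anyone retracing the repair steps from this thread, the order used was: a read-only check with -n, then a plain repair, and only then a repair with -L. The Python sketch below just builds those command lines so the escalation is explicit; the `repair_command` helper and its mode names are hypothetical, and nothing here runs xfs_repair. Note that -L zeroes the filesystem log, which can discard metadata updates that were still in the log, so it is best treated as a last resort.

```python
import shlex


def repair_command(device: str, mode: str = "check") -> list[str]:
    """Build an xfs_repair command line for one step of the escalation."""
    flags = {
        "check": ["-n"],     # no-modify dry run, only reports problems
        "repair": [],        # normal repair
        "zero_log": ["-L"],  # force log zeroing, last resort
    }
    if mode not in flags:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["xfs_repair", *flags[mode], device]


# Print the three steps in the order they were tried in this thread.
for step in ("check", "repair", "zero_log"):
    print(shlex.join(repair_command("/dev/md1", step)))
```

The device names /dev/md1 and /dev/md3 are the ones used in the posts above; on another system the array device for a given disk may be named differently.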
trurl Posted January 13, 2022
Check your lost+found share for anything repair couldn't figure out.
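A quick way to see what ended up in lost+found is to list it. This is a small Python sketch under the assumption that the repaired disk is mounted at a path such as /mnt/disk3 (that path and the `list_lost_found` helper are assumptions, not taken from the thread); entries recovered by xfs_repair are typically named after their inode numbers, so they usually need to be inspected and renamed by hand.

```python
from pathlib import Path


def list_lost_found(mount_point: str) -> list[str]:
    """Return the sorted entry names in <mount_point>/lost+found, if it exists."""
    lost = Path(mount_point) / "lost+found"
    if not lost.is_dir():
        return []
    return sorted(entry.name for entry in lost.iterdir())


# Assumed mount point; prints an empty list if nothing was moved there.
print(list_lost_found("/mnt/disk3"))
```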