jkwaterman Posted January 24 Author
Just to let you know the background of how the parity drives got unassigned: I recovered from an April 30th backup. In August or September my parity disks failed and I replaced them successfully. I thought that I had backups, but I can't find the later ones. I do use Connect (unraid.net) to do weekly backups, but I guess I haven't been taking them off the Unraid server for a while. I know that it is too late now, but I believe that is what happened. I should have thought better of using that backup, but that is water under the bridge now.
JorgeB Posted January 24
Connect the old disks in two known good slots; they can be the ones where you have the new disks 5 and 6. This is just to see if the old disks are detected. It's unlikely that they are both dead.
Pauven Posted January 24
Okay, slow down. Up to this point, most of the advice has been either about doing tests or hypothesizing options. Anytime you think you're ready to take action, please post your planned steps here for review and approval. Anytime you take an action, you're one step closer to losing data if it is the wrong action. I believe your data is still intact, so don't give up hope. But slow down and work with the guys here, and don't do anything that hasn't been reviewed and approved.

22 hours ago, JorgeB said:
That suggests something in your /config is causing the issue. You can back up the current flash drive first, then redo it and restore just the bare minimum (the key, super.dat, and the pools folder for the assignments); also copy the docker user templates folder. If all works, you can then reconfigure the server, or try restoring a few config files at a time from the backup to see if you can find the culprit.

Did you follow this guidance to back up the current flash drive first, before restoring from backup? From a planning perspective, we need to know what options remain.
Pauven Posted January 24
1 minute ago, JorgeB said:
Connect the old disks in two known good slots, it can be the ones where you have the new disks 5 and 6, just to see if the old disks are detected, unlikely that they are both dead.

I would highly recommend at least starting with the original slots where the replacement drives are being detected: remove the replacements and install the originals there. For now, don't touch any of the other "good" drives, as that could compound the problem, especially if you start to lose track of which drives are which. Keep it simple.
jkwaterman Posted January 24 Author
JorgeB said:
That suggests something in your /config is causing the issue. You can back up the current flash drive first, then redo it and restore just the bare minimum (the key, super.dat, and the pools folder for the assignments); also copy the docker user templates folder. If all works, you can then reconfigure the server, or try restoring a few config files at a time from the backup to see if you can find the culprit.

I followed this first but still couldn't assign disks, so I then went with the backup without thinking.
jkwaterman Posted January 24 Author
Okay, I swapped out the disks in bays 5 & 6 and still can't see them. Want me to check the server's BIOS to see if they are there? Here are the newest diagnostics: tower-diagnostics-20240124-1157.zip
jkwaterman Posted January 24 Author
FYI, I backed up the config from the original flash drive before doing anything.
Edited January 24 by jkwaterman
JorgeB Posted January 24
47 minutes ago, jkwaterman said:
Want me to check the servers Bios to see if they are there?

You can, but maybe disks 5 and 6 are really dead. It would be very strange to have two disks die at the same time, though, so there may be an underlying issue.
jkwaterman Posted January 24 Author
I rebooted and brought up the system BIOS. Both drives appear in the proper SATA ports.
JorgeB Posted January 24
Are they using the onboard SATA controller? If they show up in the BIOS they should also show up in Unraid. The last diags you posted had 4 disks using the onboard SATA:

Jan 24 08:11:11 Tower kernel: ata5: SATA link down (SStatus 4 SControl 300)
Jan 24 08:11:11 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata1: SATA link down (SStatus 4 SControl 300)
Jan 24 08:11:11 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Ports 1 and 5 were down. Which ports are those disks connected to?
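Editor's aside: the port check JorgeB does by eye on the syslog excerpt can be approximated with a short script. This is only a minimal sketch that assumes the kernel messages keep the exact wording shown in the post; the function name `link_states` and the regular expression are illustrative and not part of any Unraid tool.

```python
import re

# Matches kernel lines such as:
#   "... kernel: ata5: SATA link down (SStatus 4 SControl 300)"
#   "... kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)"
LINK_RE = re.compile(r"kernel: ata(\d+): SATA link (up|down)(?: ([\d.]+) Gbps)?")


def link_states(lines):
    """Return {port: (state, speed_in_gbps_or_None)}, keeping the last state seen per port."""
    states = {}
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            speed = float(m.group(3)) if m.group(3) else None
            states[int(m.group(1))] = (m.group(2), speed)
    return states


# The six lines quoted in the post above.
SAMPLE = """\
Jan 24 08:11:11 Tower kernel: ata5: SATA link down (SStatus 4 SControl 300)
Jan 24 08:11:11 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata1: SATA link down (SStatus 4 SControl 300)
Jan 24 08:11:11 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
""".splitlines()

states = link_states(SAMPLE)
down_ports = sorted(port for port, (state, _) in states.items() if state == "down")
print(down_ports)  # [1, 5]
```

A port reported as "down" only means no device negotiated a link there; it does not by itself say whether the disk, the cable, or the slot is at fault.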
jkwaterman Posted January 24 Author
The first 8 drives use onboard SATA. I then have 2 LSI PCIe SAS cards, each with 8 ports. You should see 8 SATA drives, including the 2 original drives that I just swapped out. When Unraid disables a drive, does/can it set a flag stating that the drive is unusable? Grasping at straws here.
jkwaterman Posted January 24 Author
I think these two drives are in ports 7 & 8 on the MB. The MB actually has 10 ports; 2 of the first 6 are shared, so 2 are not used.
JorgeB Posted January 24
9 minutes ago, jkwaterman said:
The MB actually has 10 ports. 2 of the first 6 are shared so 2 are not used

Thanks, I see that now. There are two additional 2-port ASMedia controllers, and both disks are failing to initialize on one of them:

Jan 24 08:11:11 Tower kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata7: limiting SATA link speed to 3.0 Gbps
Jan 24 08:11:11 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jan 24 08:11:11 Tower kernel: ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jan 24 08:11:11 Tower kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata8.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata8.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata8: limiting SATA link speed to 3.0 Gbps
Jan 24 08:11:11 Tower kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jan 24 08:11:11 Tower kernel: ata8.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)

When this happens the disks can still show up in the BIOS and it can still be a disk issue, but to be 100% certain, please try connecting these two disks to the Intel controller; you can swap them with two others that are using it. If they do the same on the Intel controller, I would say for sure those disks are bad.
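Editor's aside: the pattern in these lines (the link comes up, IDENTIFY fails, the kernel lowers the link speed, IDENTIFY fails again) can be tallied per device. The sketch below is a rough illustration that assumes the message wording shown in the post; `summarize` is a hypothetical helper name, and only the ata7 lines are used as sample input.

```python
import re
from collections import Counter

# Patterns assume the kernel message wording quoted in the post.
IDENTIFY_RE = re.compile(r"kernel: (ata\d+\.\d+): failed to IDENTIFY")
LIMIT_RE = re.compile(r"kernel: (ata\d+): limiting SATA link speed to ([\d.]+) Gbps")


def summarize(lines):
    """Count IDENTIFY failures per device and record ports whose link speed was lowered."""
    failures = Counter()
    downgraded = {}
    for line in lines:
        m = IDENTIFY_RE.search(line)
        if m:
            failures[m.group(1)] += 1
            continue
        m = LIMIT_RE.search(line)
        if m:
            downgraded[m.group(1)] = float(m.group(2))
    return failures, downgraded


# The ata7 lines quoted in the post above.
SAMPLE = """\
Jan 24 08:11:11 Tower kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 24 08:11:11 Tower kernel: ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata7: limiting SATA link speed to 3.0 Gbps
Jan 24 08:11:11 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jan 24 08:11:11 Tower kernel: ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 24 08:11:11 Tower kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
""".splitlines()

failures, downgraded = summarize(SAMPLE)
print(dict(failures))  # {'ata7.00': 2}
print(downgraded)      # {'ata7': 3.0}
```

A device that still fails IDENTIFY after the speed downgrade, and does so again on a different controller, matches the reasoning in the thread that the disk itself is the likely culprit.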
jkwaterman Posted January 24 Author
I swapped 7 & 8 with 4 & 6. Diagnostics attached: tower-diagnostics-20240124-1545.zip
jkwaterman Posted January 24 Author
I have been going through this thread while waiting for confirmation that the original disks are bad. I noticed that I made a statement that may cause others to think that I did not back up the config before going down this path. I did back up the config from the original flash drive before doing anything. I do have a super.dat file in the config folder backup.
JorgeB Posted January 25
Jan 16 16:12:36 Tower kernel: ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 16 16:12:36 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)
Jan 16 16:12:36 Tower kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jan 16 16:12:36 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 16 16:12:36 Tower kernel: ata4.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Jan 16 16:12:36 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)
Jan 16 16:12:36 Tower kernel: ata4: COMRESET failed (errno=-16)
Jan 16 16:12:36 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)

Same issue. I think we can conclude that the old disks really failed, which is kind of strange, both failing the same way at the same time.

10 hours ago, jkwaterman said:
I did back up the config from the original flash drive before doing anything. I do have a super.dat file in the config folder backup.

This is from when the disks were already disabled? With the array like this:
jkwaterman Posted January 25 Author
On 1/23/2024 at 1:29 PM, JorgeB said:
That suggests something in your /config is causing the issue. You can back up the current flash drive first, then redo it and restore just the bare minimum (the key, super.dat, and the pools folder for the assignments); also copy the docker user templates folder. If all works, you can then reconfigure the server, or try restoring a few config files at a time from the backup to see if you can find the culprit.

I backed up the flash drive config at this point in time, before I mucked with the April backup, when we know I had parity. Yes, if I understand you, the two drives were disabled.
JorgeB Posted January 25 Solution
You can then try a new flash drive install with that super.dat. It should show the array with both disks disabled but with parity still valid; you can then see if the disks can be emulated and rebuilt.
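Editor's aside: for readers who want to stage the "bare minimum" restore from a saved copy of the config folder, the idea can be sketched as below. Everything here is an assumption for illustration: the item list mirrors what the thread names (the key, super.dat, the pools folder, the docker user templates folder), the `plugins/dockerMan/templates-user` path and the `*.key` pattern should be checked against your own flash backup, and `restore_minimal` is a hypothetical helper, not an Unraid utility. The demo runs against throwaway temporary folders, never a real flash drive.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed names inside the backed-up "config" folder; verify them on your own backup.
MINIMAL_ITEMS = ["super.dat", "pools", "plugins/dockerMan/templates-user"]


def restore_minimal(backup_config: Path, new_config: Path) -> list:
    """Copy only the minimal items (plus any *.key file) and return what was copied."""
    copied = []
    wanted = [backup_config / item for item in MINIMAL_ITEMS]
    wanted += sorted(backup_config.glob("*.key"))
    for src in wanted:
        if not src.exists():
            continue  # skip anything this backup does not contain
        rel = src.relative_to(backup_config)
        dst = new_config / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied


# Demo on temporary folders with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "backup" / "config"
    fresh = Path(tmp) / "fresh" / "config"
    (backup / "pools").mkdir(parents=True)
    (backup / "super.dat").write_bytes(b"demo")
    (backup / "Demo.key").write_bytes(b"demo")
    (backup / "network.cfg").write_text("left behind on purpose")
    fresh.mkdir(parents=True)
    copied = restore_minimal(backup, fresh)
    extra_restored = (fresh / "network.cfg").exists()

print(copied)          # ['super.dat', 'pools', 'Demo.key']
print(extra_restored)  # False
```

Copying into a staging folder first, rather than straight onto the flash drive, leaves room to review the result before anything on the server changes, which fits the "post your plan before acting" advice given earlier in the thread.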
jkwaterman Posted January 25 Author
I should place my drives back in their proper spots first, correct?
JorgeB Posted January 25
You can, but as long as all are connected it's not a problem.
jkwaterman Posted January 25 Author
Good news, I think. I believe I can try the rebuild now.
JorgeB Posted January 25
You can first unassign both new disks, then start the array to confirm the emulated disks are mounting before rebuilding. They may need a filesystem check, and if they do, it's better to run it before rebuilding to make sure it works.
jkwaterman Posted January 25 Author
Thanks, I will. I am marking the thread solved.