July 30, 2024 I just rebooted my unRAID VM, and about 30-60 minutes later I got three emails reporting errors on 9 drives and Parity 2 offline. What could it be? Is it fixable?
July 30, 2024 Author On 7/30/2024 at 11:18 PM, JorgeB said: Please post the diagnostics. Oh! Sorry, I have never used that before. I'm in a bit of a panic right now! It has been acting up for the last few days. unRAID says it's Parity 2 (WR500ZT6) that is offline.
July 31, 2024 Community Expert All disks dropped offline at the same time, suggesting a power, connection or controller problem. Reboot and post new diags after array start.
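For anyone reading along who wants to check the "all disks dropped at the same time" pattern in their own syslog, here is a rough sketch, not an official unRAID tool. It assumes the kernel logged lines containing "I/O error, dev sdX" and that each line starts with a "Mon DD HH:MM:SS" timestamp; the sample lines, the hostname and the 60-second window are made up for illustration and are not taken from the diagnostics in this thread.

```python
import re
from datetime import datetime, timedelta

# Kernel block-layer errors contain "I/O error, dev sdX". The leading
# "Mon DD HH:MM:SS" timestamp layout is an assumption about the syslog format.
IO_ERROR = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d).*I/O error, dev (sd[a-z]+)")


def first_error_per_device(lines, year=2024):
    """Return {device: datetime of its earliest I/O error} from syslog-style lines."""
    first = {}
    for line in lines:
        match = IO_ERROR.search(line)
        if match is None:
            continue
        device = match.group(2)
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        # Keep only the earliest error seen for each device.
        if device not in first or stamp < first[device]:
            first[device] = stamp
    return first


def dropped_together(first, window=timedelta(seconds=60)):
    """True if at least two devices failed and all first errors fit in one window."""
    if len(first) < 2:
        return False
    stamps = sorted(first.values())
    return stamps[-1] - stamps[0] <= window


# Hypothetical sample lines, invented for this example.
SAMPLE_LOG = [
    "Jul 30 22:41:07 Tower kernel: I/O error, dev sdh, sector 2048 op 0x0:(READ)",
    "Jul 30 22:41:07 Tower kernel: I/O error, dev sdi, sector 4096 op 0x0:(READ)",
    "Jul 30 22:41:09 Tower kernel: I/O error, dev sdh, sector 2056 op 0x0:(READ)",
]

if __name__ == "__main__":
    first = first_error_per_device(SAMPLE_LOG)
    print(sorted(first), dropped_together(first))
```

If several unrelated disks show their first error within the same few seconds, that points at something they share (power, cabling, backplane or controller) rather than at the individual drives.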
July 31, 2024 Author On 7/31/2024 at 9:05 AM, JorgeB said: All disks dropped offline at the same time, suggesting a power, connection or controller problem. Reboot and post new diags after array start. Here is a new one. The array start just kept spinning at "starting", so after a while I had to reload the interface. "SDH" and "SDI" stayed offline, so this does not feel great.
July 31, 2024 Community Expert I don't see any errors in the syslog. You now have two disabled disks, but that was expected. Post a screenshot of Main.
July 31, 2024 Author I just talked with the store where I got the HBA card, and he said that the RAID version of my HBA card had problems with the tension tabs holding the cooler degrading after about 5 years, and my HBA is 6 years old. So I now have to disassemble my storage room and take down a shelf to reach my rack server, pull it out and check the HBA. For the last few days my unRAID has been very unstable. I guessed it was because I have been rebuilding my network and setting up VLANs for the first time, but I haven't changed the VLANs in the last 2 days and it still loses connection from time to time, so something is bad.
July 31, 2024 Community Expert Emulated disk7 is mounting. Assuming the contents look correct, you can then rebuild on top and re-sync parity at the same time: https://docs.unraid.net/unraid-os/manual/storage-management#rebuilding-a-drive-onto-itself Up to you if you want to try now or once you have the new controller; I cannot say for sure that was the problem.
July 31, 2024 Author Well, I've started removing shelves so I will have a look, but it has been crazy unstable for the last week. The motherboard is 12 years old and the HBA is 6. I can't afford new stuff now or even over time, so this sucks. BIG TIME.
July 31, 2024 Author 3 hours ago, JorgeB said: Emulated disk7 is mounting. Assuming the contents look correct, you can then rebuild on top and re-sync parity at the same time: https://docs.unraid.net/unraid-os/manual/storage-management#rebuilding-a-drive-onto-itself Up to you if you want to try now or once you have the new controller; I cannot say for sure that was the problem. It could be my backplane, because I couldn't see anything wrong with the HBA. The backplane is made by Gooxi and I have had a bad experience with their backplanes before. The cooler was mounted on the HBA as it should be. Is it safe to unselect both drives in "step 2"? I will do this in "Maintenance mode". I don't trust my system because of all of this.
July 31, 2024 Community Expert 47 minutes ago, DigitalLF said: Is it safe to unselect both drives in "step 2"? Yep, same as them being currently disabled.
July 31, 2024 Author 2 hours ago, JorgeB said: Yep, same as them being currently disabled. And here in Step 6, should I reselect both the data drive and the parity drive?
July 31, 2024 Community Expert Yes, you can do both at the same time. Since they are already disabled, there's no extra risk.
August 1, 2024 Author On 7/31/2024 at 11:32 PM, JorgeB said: Yes, you can do both at the same time. Since they are already disabled, there's no extra risk. The rebuild stopped with errors. What can I check so I don't mess this up more?
August 1, 2024 Community Expert Same problem as before, main suspects would be a power issue or the controller.
August 1, 2024 Author 4 minutes ago, JorgeB said: Same problem as before, main suspects would be a power issue or the controller. I did check the HBA and also updated its firmware yesterday, but the case is a cheap Gooxi rack chassis with a backplane, and the backplane is my suspect. I have no way of testing that right now, but I will try to find cables to do so.
August 1, 2024 Author 22 minutes ago, JorgeB said: Same problem as before, main suspects would be a power issue or the controller. Also, does this failure mean that I have lost lots of data? As a next step I will connect a drive to the motherboard, install Windows and see if it's stable. After that, maybe I could connect all drives directly to the motherboard, without the backplane.
August 1, 2024 Community Expert 4 hours ago, DigitalLF said: Also, does this failure mean that I have lost lots of data? Not necessarily, but you need to reboot to see the current array status.
August 1, 2024 Author 5 hours ago, JorgeB said: Not necessarily, but you need to reboot to see the current array status. Do I need to start the array? I did get an email with the subject "Data-Rebuild finished (897 errors)" and "Description: Canceled".
August 1, 2024 Author On 8/1/2024 at 9:34 PM, JorgeB said: Yes, to see if the emulated disk7 is still mounting. Starting now. While just browsing the array I now get more error emails. Sure, it's just 8 errors so far, but something is bad. Should I buy breakout cables and try without the backplane? Any thoughts on what the next step would be?
August 1, 2024 Community Expert I don't see errors in the diags, but it's probably the same issue. Disk7 was still mounting, which is good. 1 hour ago, DigitalLF said: Should I buy breakout cables and try without the backplane? It's worth a try, and I think there's no point in trying again without changing something.
August 1, 2024 Author 47 minutes ago, JorgeB said: I don't see errors in the diags, but it's probably the same issue. Disk7 was still mounting, which is good. It's worth a try, and I think there's no point in trying again without changing something. Good to hear! Just while I was typing my last post, the errors per drive increased by hundreds. You can see it in the screenshot. I will order 2x CBL-SAST-0948 cables to try to see whether it is the backplane or something else. Thank you so much @JorgeB. It really does mean a lot to me that you have taken the time to help me out, because stress has always been an enormous problem for my health and this crash has not done me well. But your help calmed me down quite a bit. So again, thank you so much.
August 3, 2024 Author @JorgeB I got the two Mini SAS (SFF-8643 Host) to 4X SATA cables from Amazon today. I was just thinking: what if I disconnect all the drives, connect just a spare 1 TB disk I have left over, and run a preclear using the new breakout cables? That would eliminate the backplane from the equation. Wouldn't that show errors the same way if it's the HBA, MB, RAM or PSU? Before, I only needed to start the array to see the errors start counting up fast. What else is there I can do? I can connect 8 of the 9 drives to the new breakout cables and one to the motherboard, so that would be fine, I guess, IF the preclear shows no errors.
August 4, 2024 Community Expert You can try, but the problem may only manifest itself with multiple drives running.
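To illustrate the "multiple drives running" point: a single-disk preclear only loads one drive, whereas a test that reads from several drives at once puts the shared parts (PSU, cables, controller) under load together. The snippet below is only a sketch of that idea, assuming read-only access to the devices and root privileges. The function names and the device paths in the comment are hypothetical, and this is not how unRAID or the preclear plugin works internally.

```python
import threading


def read_all(path, chunk_size=1024 * 1024):
    """Read `path` sequentially to the end; return (bytes_read, error or None)."""
    total = 0
    try:
        # Opened read-only and unbuffered; nothing is ever written to the device.
        with open(path, "rb", buffering=0) as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except OSError as exc:
        # An I/O error mid-read is the symptom being looked for, so report it.
        return total, exc
    return total, None


def read_concurrently(paths, chunk_size=1024 * 1024):
    """Read every path in its own thread so all devices are busy at the same time."""
    results = {}

    def worker(path):
        results[path] = read_all(path, chunk_size)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Example call (placeholder device names, adjust to your own system):
# results = read_concurrently(["/dev/sdb", "/dev/sdc", "/dev/sdd"])
```

If reads that succeed one drive at a time start failing when several run together, that would again point at shared hardware rather than a single disk. Watch the syslog while it runs, since the kernel may log errors there that never reach this script.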