Gsusking2 Posted June 22, 2021 (edited)
Hello everyone. I ran into a problem today when I accidentally pulled a drive from my array while it was still running. I quickly put it back (hot swap), then shut down the server and rebooted. On reboot it gave me disk errors on a few disks and I could not start the array. I refreshed the config so I could start the array and get some info. It seems that 3 disks are having write errors, but only when the array starts. I have tried reseating my mini-SAS cables and believe it to be a problem with the HBA, but just to make sure I have attached my diagnostics.
The drives giving me trouble are SDQ, SDR and SDS. They seem to show up and work, but then they throw errors and show up as unassigned disks.
If anyone has the time to look at this, can you confirm my suspicions, or tell me if I'm totally eff'd and should just buy new drives, or be ready to kiss my media goodbye? Whatever the case, I would really appreciate any knowledge on the subject and any guidance.
tower-diagnostics-20210622-1914.zip
Edited June 23, 2021 by Gsusking2 clarity

JorgeB Posted June 23, 2021
6 hours ago, Gsusking2 said: it to be a problem with the HBA.
It's possible, but first I would swap those disks with others from different slots and see if the problem follows the disks or stays with the slots.

Gsusking2 Posted June 23, 2021 Author
5 hours ago, JorgeB said: It's possible, but first I would swap those disks with others from different slots and see if the problem follows the disks or stays with the slots.
Yes, I have tried that. It seems to be the same problem. I have even changed my SAS expander and some of the wires too. Is it possible it’s a power problem? I have 16 drives on one Molex line.

itimpi Posted June 23, 2021
33 minutes ago, Gsusking2 said: Is it possible it’s a power problem? I have 16 drives on one Molex line.
That could definitely be a problem! Normally no more than about 4 drives per line would be recommended, and you want to avoid using too many splitters, as they are also often a contributor to power issues.

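To put rough numbers on the "16 drives on one Molex line" concern, here is a minimal arithmetic sketch. The 2.0 A per drive on the 12 V rail during spin-up is an illustrative assumption, not a figure from this thread; check the drive datasheets for real values. The function name `spinup_current_amps` is hypothetical.

```python
def spinup_current_amps(drive_count, amps_per_drive_12v=2.0):
    """Estimate the combined 12 V current if all drives spin up at once.

    amps_per_drive_12v is an assumed, illustrative spin-up figure;
    replace it with the value from your drives' datasheets.
    """
    return drive_count * amps_per_drive_12v


# Compare the commonly suggested ~4 drives per line with the 16 in this thread.
for drives in (4, 16):
    amps = spinup_current_amps(drives)
    print(f"{drives} drives: about {amps:.0f} A on the 12 V rail at spin-up")
```

Under that assumption, 16 drives on a single line draw four times the current of the roughly 4 drives suggested above, which is why spreading the backplane connectors across more cables from the power supply is worth trying before replacing disks.
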
Gsusking2 Posted July 4, 2021 Author (edited)
On 6/23/2021 at 2:23 AM, JorgeB said: It's possible, but first I would swap those disks with others from different slots and see if the problem follows the disks or stays with the slots.
OK, so I did that, but it seemed to follow the slots in Unraid; then one would get better and others would fail. It seems I lost a few disks. I replaced them, but I am still having the same problems.
@itimpi I have attached the latest logs. I lost 3-4 disks and my shares are unavailable. Some of the drives were formatted. I think I totally effed myself. Please give any advice, or do you recommend I start over from scratch? I would be sad to lose everything, but nothing was irreplaceable. It is just a big media collection, with no backups except for the music.
tower-syslog-20210704-0327.zip
Edited July 4, 2021 by Gsusking2

JorgeB Posted July 4, 2021
3 hours ago, Gsusking2 said: Some of the drives were formatted.
Any data on formatted drives would be lost. If you want more advice, for now please post the diagnostics: Tools -> Diagnostics (after array start).

Gsusking2 Posted July 4, 2021 Author (edited)
3 hours ago, JorgeB said: Any data on formatted drives would be lost. If you want more advice, for now please post the diagnostics: Tools -> Diagnostics (after array start).
Thank you for the guidance. The diagnostics are attached. When the parity rebuild starts, disks 1, 2, 7, 8, 9 and 13 start to throw millions of read errors, and disk 8 has an unmountable file system.
tower-diagnostics-20210704-0620.zip
Edited July 4, 2021 by Gsusking2 clarity

JorgeB Posted July 4, 2021
7 hours ago, Gsusking2 said: but it seemed to follow the slots in Unraid,
Do you mean the problem happens in certain slots? If yes, it suggests a backplane issue; it could also be a power issue, depending on how the backplane is powered. Don't replace or format any more disks. Just try to do the parity sync after replacing the backplane or checking power. If there are still read errors on multiple disks, there's still a problem.

Gsusking2 Posted July 4, 2021 Author (edited)
46 minutes ago, JorgeB said: Do you mean the problem happens in certain slots? If yes, it suggests a backplane issue; it could also be a power issue, depending on how the backplane is powered. Don't replace or format any more disks. Just try to do the parity sync after replacing the backplane or checking power. If there are still read errors on multiple disks, there's still a problem.
I mean I would change the disk and change the slot in my chassis. Now that I look closer, it seems to be throwing errors indiscriminately. I have 2 Molex lines coming from my 1300 W power supply; they are both used to power the fan wall and the 6 Molex connections on the backplane. I am thinking about getting some SATA-to-Molex adapters to spread the power across another cable. I will buy new splitters and test with those. I also bought new SAS cables to test from the HBA to the backplane.
Edited July 4, 2021 by Gsusking2

Gsusking2 Posted July 24, 2021 Author
Still struggling with this. I am replacing the wires, SAS expanders and SAS card on Monday. Is there anything I should post here to show you guys what's going on? I'm fairly lost.

JorgeB Posted July 24, 2021
6 hours ago, Gsusking2 said: Replacing wires and SAS expanders and SAS card on Monday.
First see if this helps.

Gsusking2 Posted September 2, 2021 Author
On 7/24/2021 at 5:15 AM, JorgeB said: First see if this helps.
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
...Sorry, could not find valid secondary superblock
Exiting now.
I'm still unable to run an XFS repair on the 2 disks that have the red X on them. What steps should or can I take from here?

itimpi Posted September 2, 2021
1 hour ago, Gsusking2 said:
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
...Sorry, could not find valid secondary superblock
Exiting now.
I'm still unable to run an XFS repair on the 2 disks that have the red X on them. What steps should or can I take from here?
Are you trying the xfs_repair from the GUI or the command line? If from the command line, exactly what command are you running?

trurl Posted September 2, 2021
1 hour ago, Gsusking2 said: XFS repair on the 2 disks that have the red X on them
And an XFS repair isn't the way you fix a red X. Are the emulated disks unmountable? Post new diagnostics.

Gsusking2 Posted September 2, 2021 Author
1 hour ago, trurl said: And an XFS repair isn't the way you fix a red X. Are the emulated disks unmountable? Post new diagnostics.
Diagnostics posted. Yes, the emulated disks are unmountable.
tower-diagnostics-20210902-0751.zip

Gsusking2 Posted September 2, 2021 Author
1 hour ago, itimpi said: Are you trying the xfs_repair from the GUI or the command line? If from the command line, exactly what command are you running?
I tried to do it from the GUI, though before that I had tried it from the command line. Let me know a command to try and I will give it a shot.

JorgeB Posted September 2, 2021
You're still having multiple disk errors:
Sep 2 07:38:55 tower kernel: md: disk10 read error, sector=10128
Sep 2 07:38:55 tower kernel: md: disk8 read error, sector=12800
The disabled disks can't be correctly emulated with errors on additional disks.

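For anyone wanting to see how widespread such errors are, the following is a minimal sketch that tallies `md: diskN read error` lines per disk. It assumes the syslog lines have the same shape as the two quoted above; the function name `count_read_errors` and the `sample` list are illustrative, and with a real log you would pass the lines of the syslog file from the diagnostics zip instead.

```python
import re
from collections import Counter

# Matches md driver lines of the form quoted in the post, e.g.
# "Sep 2 07:38:55 tower kernel: md: disk10 read error, sector=10128"
READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Return a Counter mapping disk number -> number of read-error lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The two lines quoted in the thread, used here as sample input.
sample = [
    "Sep 2 07:38:55 tower kernel: md: disk10 read error, sector=10128",
    "Sep 2 07:38:55 tower kernel: md: disk8 read error, sector=12800",
]

for disk, errors in sorted(count_read_errors(sample).items()):
    print(f"disk{disk}: {errors} read error(s)")
```

Errors spread across many disks at the same moment point toward something the disks share (power, cabling, backplane, controller) rather than several disks failing independently.
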
Gsusking2 Posted September 2, 2021 Author (edited)
1 hour ago, JorgeB said: You're still having multiple disk errors:
Sep 2 07:38:55 tower kernel: md: disk10 read error, sector=10128
Sep 2 07:38:55 tower kernel: md: disk8 read error, sector=12800
The disabled disks can't be correctly emulated with errors on additional disks.
OK, should I try a repair of disks 8 and 10? I also have replacement drives on standby.
Edited September 2, 2021 by Gsusking2

JorgeB Posted September 2, 2021
28 minutes ago, Gsusking2 said: OK, should I try a repair of disks 8 and 10?
Those types of errors are usually caused by bad power or a bad connection. What have you replaced so far?

Gsusking2 Posted September 7, 2021 Author
On 9/2/2021 at 12:12 PM, JorgeB said: Those types of errors are usually caused by bad power or a bad connection. What have you replaced so far?
OK, as of today I have replaced all the SAS cables and have tried different power supplies. My problem has progressed: since changing out all the wiring, my chassis (Norco 4224) isn’t recognizing a lot of disks. I have swapped the HBA to test that, swapped SAS expanders to check that, and hooked the HBA directly to each backplane to test. I even have a spare chassis (Norco 4220), and still some drives don’t show up. Let me know if I should post diagnostics, though I can’t start my array.

JorgeB Posted September 7, 2021
5 minutes ago, Gsusking2 said: and still some drives don’t show up.
If the drives still don't show up after replacing all of that, make sure they are good by seeing if they are detected in a different computer.

Gsusking2 Posted September 7, 2021 Author
1 minute ago, JorgeB said: If the drives still don't show up after replacing all of that, make sure they are good by seeing if they are detected in a different computer.
Can I safely plug them into my Windows gaming PC? Would I just check in the BIOS? Sorry, I'm fairly new, but I do understand most concepts and instructions.

trurl Posted September 7, 2021
7 minutes ago, Gsusking2 said: Can I safely plug them into my Windows gaming PC? Would I just check in the BIOS? Sorry, I'm fairly new, but I do understand most concepts and instructions.
Yes, Windows won't do anything with them as long as you don't format them. The BIOS should be enough to detect them.

Gsusking2 Posted September 9, 2021 Author
On 9/7/2021 at 10:44 AM, trurl said: Yes, Windows won't do anything with them as long as you don't format them. The BIOS should be enough to detect them.
OK, it seems the drives are no longer good. I think my backplane might have fried them. In this case, with losing so many disks at once, including a parity disk, is it just a start-over scenario? (No, I don't have a backup. This was mainly a media server with all replaceable content.)

JorgeB Posted September 9, 2021
54 minutes ago, Gsusking2 said: is it just a start-over scenario?
You can do a new config to keep the data on the remaining data drives.