mfarlow Posted February 3
Okay, I finally surrender. I have an ongoing issue where every time I run a parity check, one of the disks goes bad (red X). This has been going on for close to a year now. I run my parity check at the beginning of each month, and each month the parity check starts, runs for a short time, and then the array goes bad and I am left with a bad disk. I should mention this started happening after I added my 10th data drive to the array. I should also mention that it is not the same disk going bad. One month it is disk2, the next month it might be disk8 or disk6. When I run a SMART check on the disk, it comes back with no errors. I am currently running Unraid 6.12.6.
Right now I am sitting with the array stopped so the issue doesn't progress any further. In the past I would perform a new config and reassign the drives, and it would run for a short time before the disk went bad and became unmountable. At that point I would run the built-in File System Check tool, which would tell me that something was corrupted and try to rebuild it. (Sorry, I forget what gets corrupted in the file system, probably one of the nodes.) At that point I would pull the drive, replace it with a new one, and everything would work again until the next month.
Initially I thought it might have been the drives, and just kept replacing them with new drives. But then it started happening with the new drives as well. So I turned my attention to the HBA. My parity and cache plug directly into the motherboard SATA ports. My HBA supports 8 data drives (the other 2 are on the mobo). I figured maybe the HBA was struggling, so I replaced it. I was able to get 1 parity check out of it before the issue returned. Next I decided to run 2 HBAs and split the load across them, 4 drives each. Again the issue returned. At this point I thought maybe the temps in the case were too high (they were). So I added cooling fans on top of the HBAs, which drastically reduced the temps. The issue still occurred.
At this point I thought perhaps it was a power draw issue. I have 4 data drives connected to a SilverStone CP06-E4 power splitter. I have 4 drives on each splitter. Altogether I am running 3 splitters. I decided to add 2 more splitters for the data drives so there are only 2 drives per splitter. I also added another SATA power cable to my power supply so that I have more power connections to spread around. Again, none of these seemed to help.
I am not worried too much about data loss as I have backups, but it is getting to be a PITA having to restore backups every single month. We're talking about 10-14 TB of data to restore every month. So at this point I am tired of banging my head against the wall and was hoping someone from this forum might have a suggestion or idea that I can try.
tower-syslog-20240203-1851.zip tower-diagnostics-20240203-1350.zip
itimpi Posted February 3
The symptoms do sound very much like insufficient power. You never actually mentioned what PSU you are using.
mfarlow Posted February 3 Author Solution
I am running a Segotep 650W Gold 80plus. https://www.amazon.com/gp/product/B0832F6NS8/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
JorgeB Posted February 4
The disk looks fine and it's not logged as a disk problem, so this is most likely a power or connection issue. Since it's happening to multiple disks, power would be a good place to start: see if you can test a different PSU, and make sure no power splitters are in use, or at least only an acceptable number of them.
mfarlow Posted February 5 Author
I currently have 2 SilverStone splitters each running 4 drives (the type with the capacitors). I have actually replaced those already, just to be safe. I'm going to try rewiring so I only have 2 drives per splitter; maybe that will help. In the meantime I will try to acquire another power supply and see if that helps. Do you think my current power supply might be underpowered? I appreciate all the help!
JorgeB Posted February 5
35 minutes ago, mfarlow said: Do you think my current power supply might be underpowered?
Not really, though I don't recognize that brand. The PSU may just be starting to have issues.
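For a rough sense of the "underpowered" question, here is a back-of-the-envelope sketch of 12 V load at spin-up. The function name `spinup_load` and the 2.0 A per drive figure are assumptions for illustration only, not values measured on this server or taken from the drives' datasheets; substitute the real numbers for your drives.

```python
# Rough spin-up load estimate for hard drives on the 12 V rail.
# ASSUMPTION: 2.0 A per drive at spin-up is a ballpark placeholder,
# not a datasheet value for the drives in this thread.

def spinup_load(num_drives, amps_12v_per_drive=2.0, drives_per_splitter=4):
    """Return (total_amps, amps_per_splitter, total_watts) on the 12 V rail."""
    total_amps = num_drives * amps_12v_per_drive
    amps_per_splitter = drives_per_splitter * amps_12v_per_drive
    total_watts = total_amps * 12.0
    return total_amps, amps_per_splitter, total_watts


if __name__ == "__main__":
    # Compare the two wiring layouts discussed above: 4 vs 2 drives per splitter.
    for per_splitter in (4, 2):
        total, chain, watts = spinup_load(10, drives_per_splitter=per_splitter)
        print(f"{per_splitter} drives/splitter: {chain:.1f} A per splitter, "
              f"{total:.1f} A / {watts:.0f} W total at spin-up")
```

Under that assumed figure, ten data drives come to roughly 240 W at spin-up, which fits the view that a 650 W unit is not underpowered on paper. The sketch only covers nominal load; it says nothing about a failing PSU, a bad connector, or an overloaded single cable.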
mfarlow Posted March 8 Author
I was finally able to replace my power supply. I ordered a 650W EVGA that was on the A-Tier of the PSU list. The first one took 2 weeks to arrive from Amazon. Then, due to work, I was unable to rewire my Unraid server for a while. I finally got the replacement PSU in, but one of the SATA ports on the PSU was bad, which limited me to powering only 4 drives, so off to order another replacement. Turns out they had sent me a returned PSU. The 2nd replacement arrived and I was able to quickly get it installed.
I was able to start the array, perform a new config to get rid of the red X on my "bad" drive, and run a full parity check, which took a couple of days. The parity check completed, but oddly I received an error message that there were x number of errors (it was a lot). But after running a SMART report I found no errors with any of the drives. So far the drives have not turned bad. I assume the error I saw very briefly had to do with the parity drive being in an error state during the parity check.
I think it is too early to say for certain it was the power supply, but for now it is working fine. I plan on running another parity check in a week or so to see if the issue returns. For now, I am backing up the data just in case it happens again. I want a full backup before I run the parity check.
I wanted to thank everyone who chimed in on this. I was getting pretty frustrated with Unraid, and was considering switching to another storage solution. Glad I didn't have to switch.