Abell Posted July 4, 2023 (edited)

Hello everyone,

I've been seeing a strange issue where a random drive becomes disabled when the monthly parity check runs. Between parity checks the drives work without issues; it's only when the parity check starts that one random drive becomes disabled. While a drive rebuild is running, none of the drives fail. It only happens during the parity check. I'm guessing that a drive rebuild needs all of the drives spun up in order to recalculate the missing drive's data, so it should place the same demand on the system as a parity check.

Breakdown:
- Unraid 6.12.0
- Total of 7 drives
  - 6 are connected to a SuperMicro AOC-S2308L in IT Mode via Mini SAS to SATA
  - 1 is connected via SATA directly to the motherboard
- Only fails in the monthly parity check
- No disabled drives throughout the month
- Drive rebuild works without issues
- Of the disabled drives:
  - 3 are connected to the HBA
  - 1 is connected directly to SATA on the motherboard
- All drives are fed by 3 separate 5-pin connections to a Cooler Master 750W PSU.
- I have a UPS protecting the NAS, and it measures around 130W when parity is running and 250W on boot, so we can rule out a power limitation.

Troubleshooting:
- Tried moving the Mini SAS cables around to different drives, in case any are faulty.
- Tried reseating the HBA card.
- Moved the PSU power cables around to ensure a single connection is not at fault.

Past parity checks that had drives disabled:
- 7/1/23 - 7HK5666F
- 6/2/23 - 7SGHH6BC
- 3/31/23 - 7SGJHT7C
- 3/5/23 - 7HK5666F
- 12/29/22 - 7SGHH6BC

Looking for any other suggestions or troubleshooting steps. Thanks

Edited July 16, 2023 by Abell: added additional points
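Editor's sketch: the failure history listed in the post above can be tallied to see whether one drive dominates and how far apart the events are. This is only an illustration in Python. The function names `failures_per_drive` and `days_between_events` are hypothetical, and the dates are assumed to be in US month/day/year order (which 3/31/23 and 12/29/22 suggest).

```python
from collections import Counter
from datetime import date

# Disabled-drive events copied from the post (assumed M/D/YY), as (date, serial).
EVENTS = [
    (date(2023, 7, 1), "7HK5666F"),
    (date(2023, 6, 2), "7SGHH6BC"),
    (date(2023, 3, 31), "7SGJHT7C"),
    (date(2023, 3, 5), "7HK5666F"),
    (date(2022, 12, 29), "7SGHH6BC"),
]


def failures_per_drive(events):
    """Count how many times each drive serial was disabled."""
    return Counter(serial for _, serial in events)


def days_between_events(events):
    """Return the gaps, in days, between consecutive events (oldest first)."""
    ordered = sorted(day for day, _ in events)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for serial, count in failures_per_drive(EVENTS).most_common():
        print(f"{serial}: disabled {count} time(s)")
    print("Days between events:", days_between_events(EVENTS))
```

With the five events listed, no single drive accounts for most of the failures (two drives were disabled twice, one once), which fits the description of the failures as random rather than tied to one disk.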
JorgeB Posted July 4, 2023

Since drives connected to both controllers failed, the PSU would be the main suspect.
Abell (Author) Posted July 5, 2023

17 hours ago, JorgeB said: "Since drives connected to both controllers failed, the PSU would be the main suspect."

Thanks, I'll try replacing it with a new PSU in case that's the culprit. It's just strange that I can run a drive rebuild, which should spin up all of the drives and draw the same power, and it completes without issues. Maybe there are minute power fluctuations that the rebuild can tolerate, while the parity check fails?
JorgeB Posted July 5, 2023

Rebuild should stress the PSU in the same way, so both would be expected to fail similarly.
Abell (Author) Posted July 16, 2023

@JorgeB Replaced the PSU with a 750W "be quiet! Dark Power 13" with Titanium efficiency. The old Cooler Master was only Gold rated. Hoping this will resolve the issue and increase its lifespan, especially since it runs 24/7. I'll keep an eye on parity checks and will report back if all is good.
Abell (Author) Posted October 12

Update: Still seeing random drives being disabled, although less often. The last time it happened was on October 10, 2024, 8 days after the parity check, so the parity check itself must not be the cause. The time before that was on April 15, 2024.

The motherboard has 4 x SATA 6Gb/s connectors. I will be moving drives to fully use the SATA ports, leaving 3 drives on the SuperMicro AOC-S2308L LSI card. The only suspect left after the PSU upgrade is the LSI card. A year ago I zip-tied a small Noctua fan to the card's heatsink and have run it at full speed to avoid overheating.

Since the last 2 occurrences happened to the same parity drive while it was connected to the LSI card, I'll be moving that one to a motherboard SATA connector.