guyfwoodward Posted July 31, 2015

Hi folks,

I'm really hoping I am wrong, but it looks like Unraid has managed to destroy one of my array disks within about the first 24 hours of (fairly sedate) use. Any help identifying the problem would be much appreciated.

I had just finished configuring a 3-drive array (3x 3TB WD Red plus an older HDD pulled from a desktop for cache). Parity check was underway but I don't think it had completed, and I was working on getting Plex etc. installed and adding content. All of a sudden I started getting heat warnings from both the cache and one of the two non-parity disks in the array, then I got an error alarm on the disk, and it now shows up as unmountable.

The warnings were as follows:

Event: unRAID Disk 1 temperature
Subject: Warning - Disk 1 is hot (45 C)

Event: unRAID Disk 1 error
Subject: Alert - Disk 1 in error state (disk dsbl)

Event: unRAID array errors
Subject: Warning - array has errors

Event: unRAID Parity sync:
Subject: Notice - Parity sync: finished (338 errors)

I have run a SMART diagnostic on the drive and it appears to be dead - namely:

Self-test execution status: 121 The previous self-test completed having the read element of the test failed.

The parity sync hadn't caught up with what I had done so I think I have lost my data, and apparently written off a fairly expensive drive under pretty minimal loading, so you will understand I'm a bit annoyed.

Have I indeed written off the drive, and if so how on Earth do I prevent it from happening again? I'm amazed I have any cooling issues as the drives are mounted in a 2U server case that has a whole bank of fans for airflow - and surely Unraid is smarter than to keep pushing data down a drive that is reporting high temperatures?

As I say, hoping I am wrong - any advice would be very welcome!!

Guy
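For anyone reading the SMART output above: the "121" is a packed status byte, not an error count. As far as I understand the ATA SMART layout, the upper four bits hold the result code of the last self-test and the lower four bits hold the percentage of the test remaining, in tens. The sketch below decodes it under that assumption; the function name `decode_self_test_status` and the table wording are my own, loosely following the messages smartctl prints, so treat it as illustrative rather than authoritative.

```python
# Result codes for the upper nibble of the SMART self-test execution
# status byte. Wording is paraphrased; codes not listed are reserved.
SELF_TEST_STATUS = {
    0: "completed without error, or no self-test has ever been run",
    1: "aborted by the host",
    2: "interrupted by the host with a hard or soft reset",
    3: "could not complete: fatal or unknown error during the test",
    4: "completed having a test element that failed (element unknown)",
    5: "completed having the electrical element of the test failed",
    6: "completed having the servo/seek element of the test failed",
    7: "completed having the read element of the test failed",
    8: "completed having a failed element; handling damage suspected",
    15: "self-test routine in progress",
}


def decode_self_test_status(value):
    """Split the status byte into (code, percent_remaining, description)."""
    code = (value >> 4) & 0x0F          # upper nibble: result code
    remaining = (value & 0x0F) * 10     # lower nibble: tens of percent left
    return code, remaining, SELF_TEST_STATUS.get(code, "reserved")


code, remaining, text = decode_self_test_status(121)
print(f"code {code}, {remaining}% of test remaining: {text}")
```

Decoded this way, 121 is 0x79: result code 7 (the read element failed) with the test stopping while 90% of it was still to run, which matches the message quoted in the post.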
trurl Posted July 31, 2015

Infant mortality. Electronics often will fail early or will work until obsolete.

Did you test the drives by preclearing them?

Go to Tools - Diagnostics and post the result.
reggierat Posted July 31, 2015

I can understand that you are upset, but this is what warranty is for. You cannot blame software for a hardware failure.
gubbgnutten Posted July 31, 2015

Running preclear on new disks before adding them to the array has multiple benefits. One of them is to help avoid problems like yours by triggering failures in marginal drives before you actually trust them with data. The other great thing is that array expansion later on will be more or less instantaneous with a precleared drive.

You appear to be under the impression that the system was under relatively light load. I disagree: if parity was being built, all drives were being used at the same time, with the data drives constantly read as fast as possible and the parity drive written to in corresponding fashion.

In addition to this, you added content to Plex. I would assume that this means copying files to the array and then having Plex scan the files and generate index files in the background. Aside from stressing the I/O system even more during parity generation, this also puts the CPU at 100% use by Plex.

Maxing out both the hard drives and the CPU would certainly explain rising temperatures…

That said, this is not a problem for a healthy drive in a decently cooled system. If your components are overheating, you have a hardware problem of some kind. Since both your cache drive and one of the data drives had temperature warnings, I would first check that the fans are fully functional and that nothing is obstructing the airflow.

What was the ambient temperature? Were the drives with temperature warnings located next to each other?
trurl Posted July 31, 2015

Also, while I don't like my drives to get to 45, that default setting for the temp warning gives you a pretty safe margin. Most likely infant mortality like I said.

If you haven't already learned about preclear, see search tips in my sig.

If you didn't preclear, I would suggest starting over with preclearing all your drives and returning any that don't pass. It is very important that all your drives be trustworthy, since every bit of all the others will be required to rebuild if one of them fails. Then take things a little more slowly (preclear will give you a lesson in patience).
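To illustrate the point that every bit of all the other drives is required for a rebuild, here is a toy model of single-parity protection using byte-wise XOR. This is a simplified sketch, not Unraid's actual implementation: the three-byte "disks" and the helper `xor_blocks` are made up for the example, and it assumes the parity disk simply holds the XOR of the data disks at each position.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two hypothetical data disks, three bytes each.
disk1 = bytes([0x12, 0x34, 0x56])
disk2 = bytes([0xAB, 0xCD, 0xEF])

# Building parity reads every data disk in full and writes the XOR.
parity = xor_blocks([disk1, disk2])

# If disk1 dies, it is rebuilt from ALL the surviving disks plus parity.
rebuilt = xor_blocks([disk2, parity])
rebuild_ok = rebuilt == disk1

# A single bad byte on a surviving disk silently corrupts the rebuild.
flaky_disk2 = bytes([0xAB, 0x00, 0xEF])
bad_rebuild = xor_blocks([flaky_disk2, parity])
rebuild_corrupted = bad_rebuild != disk1

print("clean rebuild matches original:", rebuild_ok)
print("rebuild from a flaky disk is corrupted:", rebuild_corrupted)
```

The same model also shows why a parity build is not a light load: computing `parity` touches every byte of every data disk, which is the sustained all-drives activity described earlier in the thread.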