DawgMeister2000 Posted January 5

Hi everyone,

I apologise in advance, I'm not very technical, but that's the reason I need your help please.

My array has been working fine for many months:
- 1x parity drive (18TB)
- 1x data drive (12TB)
- 1x cache drive

Last week, I added a new, clean 12TB drive, so I now have 2x 12TB data drives. Everything worked fine afterwards for several days.

However, today the following sequence of events happened:
1) Got a notification with a parity error
2) Parity was automatically disabled
3) I rebooted clean, to see if that would fix it
4) Upon reboot, the parity drive was gone (not in unassigned either)

I rebooted again and then the parity drive showed up as an UNASSIGNED drive. So I stopped the array and tried to assign the drive as PARITY, but as soon as I select that drive, it disappears from the selection menu. It's still showing as UNASSIGNED, saying I need to FORMAT it. So I tried that, but formatting fails.

I opened up the machine and checked all cables. I found a post on the forum saying that I should use MOLEX to SATA adapters for powering the drives, so I changed all of the cables to MOLEX and rebooted. Same issue.

I have rebooted this machine multiple times now and played with the cables. It's always the same:
- the drive shows up as UNASSIGNED
- when I try to assign it as PARITY it won't let me

I have tried to check the logs (but I don't really understand them) and done one of those SMART diagnostics, which (amongst other things) said "scsi error medium or hardware error (serious)". Not sure what that means.

I have also run the DIAGNOSTICS, given that's what I've seen people do on this forum. Both attached. Please help.

holyteepot-diagnostics-20240105-2302.zip
holyteepot-smart-20240105-2256.zip
itimpi Posted January 5 (Solution)

Looking at that SMART report and the syslog entries, it looks likely that the drive really has failed. You could try running an extended SMART test on it (possibly on another system), as a failed extended test is normally enough to get an RMA (assuming the drive is still under warranty).
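For readers who prefer the command line, the extended test suggested above can be sketched with `smartctl` from smartmontools. This is only a minimal sketch, not the exact procedure used in this thread: the device name `/dev/sdX` is a placeholder, the helper names `run` and `smart_extended_test` are made up for illustration, and the `DRY_RUN` switch is an added convenience that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: start an extended (long) SMART self-test and read back the results.
# Assumptions: smartmontools is installed, and /dev/sdX stands in for the real device.

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Hypothetical helper: queue the extended test, then show the log and health verdict.
smart_extended_test() {
    dev="$1"
    run smartctl -t long "$dev"      # start the extended self-test (it runs on the drive itself)
    run smartctl -l selftest "$dev"  # show the self-test log (progress and past results)
    run smartctl -H "$dev"           # show the drive's overall health assessment
}

# Example (prints the commands without touching any disk):
#   DRY_RUN=1 smart_extended_test /dev/sdX
```

An extended test on a large drive can take many hours, so the self-test log has to be checked again once the test has finished. A drive that cannot even start or complete the test is itself a useful data point for a warranty claim.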
DawgMeister2000 Posted January 5 (Author)

Thanks. Is there any way I can validate whether the drive is the problem? All drives are fairly new, about 1-2 years old, and have been treated with care. They are WD RED PRO / PLUS.
trurl Posted January 5

31 minutes ago, DawgMeister2000 said: "Is there any way I can validate whether the drive is the problem?"

10 hours ago, itimpi said: "run an extended SMART test on it (possibly on another system) as if that fails it is normally enough to get an RMA (assuming the drive is still under warranty)."
DawgMeister2000 Posted January 5 (Author)

OK, so I've tried to run the extended SMART test, but I'm not sure it's working. When I click on the button, it only takes 2 seconds and then the button goes back as if nothing happened. I can click on it again and again, the result is always the same:

[screenshot not preserved]

I then tried the smartctl command directly:

[screenshot not preserved]

Not sure if I did this correctly... but this also doesn't seem to work. Sorry for being so useless. 😕
trurl Posted January 5

Post a new SMART report for the disk.
DawgMeister2000 Posted January 5 (Author)

5 minutes ago, trurl said: "Post a new SMART report for the disk"

After clicking on the report buttons, it still says "No self-tests logged on the disk". When I click on the DOWNLOAD REPORT button, I get a 404 error:

[screenshot not preserved]
DawgMeister2000 Posted January 5 (Author)

1 minute ago, trurl said: "Will it do the Short test?"

Yesterday it did. (I had uploaded it in my original post.) Today it doesn't. Not sure why.
trurl Posted January 6

Are you testing the disk on the same server it was on? Testing on a different computer might clarify whether the problem is the drive or something else in the server.
DawgMeister2000 Posted January 6 (Author)

OK, update: I have switched the power cables to my drives and now it's letting me run the SMART report. Attached.

holyteepot-smart-20240106-1118.zip
DawgMeister2000 Posted January 6 (Author)

21 minutes ago, trurl said: "Are you testing the disk on the same server it was on? Testing on a different computer might clarify whether the problem is the drive or something else in the server."

I'm using the same server. I only have this one, nothing else.
DawgMeister2000 Posted January 6 (Author)

I have done another EXTENDED SMART report (it looks the same to me). Attached.

holyteepot-smart-20240106-1153.zip
DawgMeister2000 Posted January 6 (Author)

I have switched SATA ports and power cables for all drives. The other drives work fine, but this drive fails no matter what port I use.
trurl Posted January 6

If you have space on the first data disk, you could copy everything from the new data disk to it, then use that new disk as parity. If it rebuilds as parity successfully, that would be a good test that nothing else in your server is to blame for the problems with the old parity disk.

I'm pretty sure the disk is to blame. I would think the SMART reports you already have should be enough to get it replaced, but of course I don't speak for the manufacturer.
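The copy step suggested above could look roughly like the following. This is a hedged sketch, not a tested Unraid procedure: it assumes the two data disks are mounted at `/mnt/disk1` and `/mnt/disk2` (the thread does not state the mount points), and `copy_disk_contents` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: copy everything from one mounted data disk onto another.
# Assumptions: both arguments are existing directories (for example /mnt/disk2
# and /mnt/disk1), and the destination has enough free space.

# Hypothetical helper: copy the contents of src into dst, preserving attributes.
copy_disk_contents() {
    src="$1"
    dst="$2"
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "source and destination must both be existing directories" >&2
        return 1
    fi
    # "src/." copies the directory's contents, hidden files included,
    # and leaves files that already exist only on the destination in place.
    cp -a "$src"/. "$dst"/
}

# Example (paths are assumptions):
#   copy_disk_contents /mnt/disk2 /mnt/disk1
```

Only once the copy has been checked should the new disk be unassigned from its data slot and assigned as parity, because the parity build overwrites the whole disk.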
DawgMeister2000 Posted February 3 (Author)

Hi all,

UPDATE: it was indeed the disk that was broken. I was able to return the hard drive under warranty (less than 2 years old) and replace it with a new one. Once the new disk had arrived, I formatted it and rebuilt parity. Now everything has been working perfectly fine for a week or so. No data loss.

Thanks again to everyone in this thread for helping me out. Those SMART reports didn't mean anything to me, and I wouldn't have concluded that the disk was broken without your help, because it was so new. Really appreciated.

Case closed. 🙂
trurl Posted February 3

4 hours ago, DawgMeister2000 said: "I formatted it and rebuilt parity"

There is no point in writing a format to a disk that is going to be completely overwritten. This sort of misunderstanding of "format" can lead to data loss when you need to rebuild a data disk.