falconexe

[SOLVED] Double Data Drive Failure During Parity Check (Dual Parity)


I don't see any good way to dispense with drive serial numbers. DRIVE22 wouldn't have any meaning in many of the situations we need to deal with. There are disks that aren't assigned, disks that have been or will be used to replace assigned disks, disks that we may not know how they were or should be assigned. And many times any misunderstandings when discussing these things will have some chance of lost data. And I'm not convinced I've ever heard of a good reason to keep drive serial numbers secret.

1 hour ago, falconexe said:

Where should I post such requests?

Feature Requests section

1 hour ago, falconexe said:

drive serial numbers

My opinion is that even if you have them, it doesn't do you any good. No different than the VIN on your car. You can't get warranty work done on another car by giving them a VIN that doesn't belong to the physical car. But to each their own...


Thanks guys. It goes back to the fact that SNs can be traced to a purchase location and then tied to a business. Typically that kind of information is considered confidential/protected/proprietary and is not normally out in the wild, for various reasons. I get both sides... Thanks for responding. Finally got the @Squid in here ha ha. I feel special. I've seen him and @trurl all over the place over the years. I've been a sleeper user reading the forums for a very long time, but never felt the need to get involved until this recent mishap.


So I have an update for everyone. Once the Old Disk 04 (sdad) finished preclearing successfully, I rebooted the server. Upon logging in I was greeted with this:

 

398636877_PreclearStatus-20200113-03.PNG.ae9588130d4fee9ca4096edb2452daa0.PNG

 

And to take some good advice from @Squid and @trurl, you'll notice the serial numbers are not Photoshopped out. That was getting tedious anyway 🤣

 

This was the Old Disk 22 that ended a parity check with 2,000+ errors, passed an HD Tune Pro Full Error Scan, but failed an UNRAID Preclear during the post-read process. So it was having some major issues after all. I can't understand why the SMART report kept coming back clean until after this last reboot. For the last 6 years, EVERY SINGLE TIME a disk failed, I would first see a failed SMART report indicating pending or reallocated sectors, etc.
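
For anyone who wants to script that kind of check, here is a minimal sketch that scans SMART attribute text for the usual early-warning counters (reallocated, pending, and offline-uncorrectable sectors). It assumes the text comes from `smartctl -A /dev/sdX` (smartmontools) in the standard ten-column attribute layout. The function name and the sample rows are made up for illustration and are not taken from the drive in this thread. As the post above shows, a clean result does not prove the disk is healthy.

```python
# Hypothetical helper: flag SMART attributes that often precede a failure.
# Assumes ATA attribute rows as printed by `smartctl -A /dev/sdX`, where the
# tenth whitespace-separated column is RAW_VALUE.

WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def suspicious_attributes(smartctl_output):
    """Return a dict of watched attributes whose raw value is non-zero."""
    found = {}
    for line in smartctl_output.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        name, raw = parts[1], parts[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


# Made-up sample rows, for illustration only.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(suspicious_attributes(SAMPLE))
```

An empty result only means these three counters are zero; it says nothing about power or cabling problems.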

 

This drive was ghosting SMART health during 2 full health checks inside Windows and in UNRAID. I'm sure the disk power issues have a correlation as well.

 

Anywho, this drive is TRASH and I'll be executing a 35 pass wipe with some "other" tools ha ha. We'll see if it can handle that type of onslaught. Then I will physically destroy the drive with a drill press for good measure.

 

1531890015_SMART-OLDDISK22-20200113.thumb.PNG.aebc957398022eade975579e863269b0.PNG

 

I guess we can mark this as SOLVED. By the way, since this is my first thread. Do I go change the title to "SOLVED" or does a MOD do that for me?

 

Thanks everyone.


If I may suggest something: try running it through WD Lifeguard, do an erase, and then run a full extended test. Try it again after that.

I've had success in repurposing "failed" drives for my backup box this way. One has been running another few years since 'failing' originally. I just don't take any chances with my production server.

I dunno, it's just worked for me a lot, and it's only like 12-14 hours to run it through in an external dock.


You mentioned that these drives were shucked Seagate Backup Plus? My friend was using drives similar to these with the 3.3v issue. Long story short, he had drives dropping off like this as well (not sure on the diagnostics on his side though). We figured out in his case it was due to cheap electrical tape with imperfect installation over the pins. After heating up sufficiently while spun up a long time (like during parity checks) they would slightly change shape and cause the drive to trip off. Probably not your issue, but figured I would mention it just in case. 

 

Also, I am in the planning stages of building a large storage server like yours. I was wondering if you could tell me what case you are using, and what you think about it? Been back and forth with a couple different options. 

Just now, jebusfreek666 said:

Also, I am in the planning stages of building a large storage server like yours. I was wondering if you could tell me what case you are using, and what you think about it? Been back and forth with a couple different options. 

I believe he mentioned using Storinators. Clearly falcon, other than a good taste in games (at least the two out of three) has seen too much Linus Tech Tips.

https://www.45drives.com/products/storage/

11 minutes ago, jebusfreek666 said:

You mentioned that these drives were shucked Seagate Backup Plus? My friend was using drives similar to these with the 3.3v issue. Long story short, he had drives dropping off like this as well (not sure on the diagnostics on his side though). We figured out in his case it was due to cheap electrical tape with imperfect installation over the pins. After heating up sufficiently while spun up a long time (like during parity checks) they would slightly change shape and cause the drive to trip off. Probably not your issue, but figured I would mention it just in case. 

 

Also, I am in the planning stages of building a large storage server like yours. I was wondering if you could tell me what case you are using, and what you think about it? Been back and forth with a couple different options. 

That issue sounds insane with the tape. My temps are 23-26C during parity checks, so I don't think it was that, but good to know. Thanks.

 

If you want to know about my specific build, go check out the below thread. Tons of pics. Not for the faint of heart on your wallet, but after my Norco 4224 case hitting 50+C, I LOVE the STORINATOR. Specs are insane. Around $6K not including disks ha ha.

 

 

13 minutes ago, Froberg said:

I believe he mentioned using Storinators. Clearly falcon, other than a good taste in games (at least the two out of three) has seen too much Linus Tech Tips.

https://www.45drives.com/products/storage/

Not a fan of "Linus" LOL. I actually found 45 Drives through Google after moving on from a great custom builder called Green-Leaf

http://greenleaf-technology.com/ He's out of Ohio, and was really helpful when I first got into UNRAID back in 2014.

 

11 minutes ago, jebusfreek666 said:

Ah, completely glossed over that in the sig. Thanks. 

How could you miss this LMAO? It's a frackin RAINBOW 🌈 😂

 

21 minutes ago, Froberg said:

Clearly falcon, other than a good taste in games (at least the two out of three)

Agreed on Mass Effect. My favorite video game franchise EVER. BioWare is in a sad state these days. Hoping Casey Hudson can deliver the magic again with ME4. 🙏 ...And Where the Crap is my ME Trilogy Remaster, EA? Take my Money. Easiest Cash Grab in history and they won't do it. 🙄

 

Anyway, back to UNRAID talk ha ha.

Edited by falconexe

1 minute ago, falconexe said:

Agreed on Mass Effect. My favorite video game franchise EVER. BioWare is in a sad state these days. Hoping Casey Hudson can deliver the magic again with ME4. 🙏

Actually, to be fair, one of the best bits of Mass Effect was in the third one. That DLC where you were running around the Citadel with your entire party and fighting your own clone. That had real BioWare magic. (and Boo!)

Personally, I think the Baldur's Gate series trumps ME for sheer storytelling. Isometric is so underrated. ;)

I haven't even bothered with Andromeda just out of sheer disappointment.

Just now, Froberg said:

Actually, to be fair, one of the best bits of Mass Effect was in the third one. That DLC where you were running around the Citadel with your entire party and fighting your own clone. That had real BioWare magic. (and Boo!)

Personally, I think the Baldur's Gate series trumps ME for sheer storytelling. Isometric is so underrated. ;)

I haven't even bothered with Andromeda just out of sheer disappointment.

Funny enough, I am the hugest ME fan, and I never could get myself to play the Citadel DLC. Cause then it would truly be over. That was like 5 years ago. I am keeping a 4th full playthrough in my back pocket for nostalgia's sake and will run through EVERYTHING one last time. Gotta finish The Outer Worlds first (which is fantastic BTW, and I really don't like Fallout games per se). I played Andromeda for like 40 hours, then my save got corrupted and I just could not get myself to start over. It was pretty bad. Those Faces Tho...


So the Old Disk 4 "sdad" (3 parity errors) just passed the preclear process and is ready to be added back to the array. However, I was checking the SMART reports, and everything looks perfect except this bit under Error History. Should I be worried about any of this?

 

1865045133_SMART-OLDDISK04-20200113.thumb.PNG.30b9614dc0490d0ac6b105a44e81e6da.PNG

Edited by falconexe
Typos

31 minutes ago, falconexe said:

Should I be worried about any of this?

Those errors are usually not a disk problem, I wouldn't worry for now.

4 minutes ago, johnnie.black said:

Those errors are usually not a disk problem, I wouldn't worry for now.

Ok thanks so much for getting back to me. I'll proceed with formatting the drive in XFS and remounting.

5 hours ago, falconexe said:

Do I go change the title to "SOLVED" or does a MOD do that for me?

You can edit your first post in the thread to change the title.

On 1/13/2020 at 7:49 PM, falconexe said:

Can anyone tell me what that error means and what you think is going on with this Disk that disappeared? Also this disk is in a totally different slot than before (was 22, now is 27). So I am really thinking it is NOT my UNRAID Hardware and actually the disk itself.

 

 

It appears to have powered down again after it failed the post-read. The log is below:

 

The 45Drives enclosure really is a good case, but I doubt a single power supply can fulfill the 5V power requirement for 30 HDDs plus the system. Do the cables (data/power) have several connection points instead of a single end-to-end run?

 

An easy way to check this is to use a multimeter (with Min/Max recording) to measure the 5V rail at the farthest end point: load all the HDDs and check the minimum reading.
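
As a rough sketch of how to judge the Min/Max readings from that test: the ATX specification allows ±5% on the +5V rail, which works out to 4.75V to 5.25V. The function name below is hypothetical, and the readings in the usage lines are placeholders, not measurements from this server.

```python
# ATX allows +/-5% on the +5 V rail, i.e. 4.75 V to 5.25 V.
NOMINAL_V = 5.0
TOLERANCE = 0.05


def rail_in_spec(min_v, max_v, nominal=NOMINAL_V, tolerance=TOLERANCE):
    """Return True if both recorded extremes stay inside the tolerance band."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= min_v and max_v <= high


if __name__ == "__main__":
    # Placeholder readings; substitute the Min/Max values from your meter.
    print(rail_in_spec(4.90, 5.08))  # comfortably inside the band
    print(rail_in_spec(4.62, 5.05))  # minimum sagged below 4.75 V
```

A minimum that dips below the band only while all drives are loaded would point at the power supply or cabling rather than at an individual disk.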

 

Could you provide the specs of the power supply?

 

If the problem is with the disk itself (maybe), why was it able to complete a preclear running for more than 50 hours without problems? So I prefer to believe the problem is not the disk.

Edited by Benson

6 hours ago, Benson said:

 

The 45Drives enclosure really is a good case, but I doubt a single power supply can fulfill the 5V power requirement for 30 HDDs plus the system. Do the cables (data/power) have several connection points instead of a single end-to-end run?

An easy way to check this is to use a multimeter (with Min/Max recording) to measure the 5V rail at the farthest end point: load all the HDDs and check the minimum reading.

Could you provide the specs of the power supply?

If the problem is with the disk itself (maybe), why was it able to complete a preclear running for more than 50 hours without problems? So I prefer to believe the problem is not the disk.

Doesn't the PSU come with the case, as in, 45Drives made sure of this?


If I recall, the PSU on the Storinator Q30 is a Corsair 700 Watt. Once I get some time, I can crack it open and confirm. But yeah, that is well over the 150W requirement at 5V X 30 Drives, with plenty of overhead for Dual CPUs, MB, RAM, etc...
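
To make the arithmetic explicit: 150W at 5V across 30 drives implies about 1A of 5V current per drive, or 30A in total. The sketch below compares that load against the +5V rail's own amp rating, which is printed on the PSU label separately from the total wattage. The function name is hypothetical and the 25A rail rating is a placeholder, not the spec of the PSU in this build.

```python
def five_volt_budget(drives, amps_per_drive, rail_amps, volts=5.0):
    """Compare the drives' combined 5 V draw with the rail's rated current."""
    needed_amps = drives * amps_per_drive
    return {
        "needed_amps": needed_amps,
        "needed_watts": needed_amps * volts,
        "headroom_amps": rail_amps - needed_amps,
    }


if __name__ == "__main__":
    # 30 drives at 1 A each matches the 150 W figure quoted above.
    # rail_amps=25.0 is a placeholder; read the real value off the PSU label.
    print(five_volt_budget(drives=30, amps_per_drive=1.0, rail_amps=25.0))
```

Negative headroom would mean the 5V rail is the bottleneck even when the total wattage looks generous.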

 

Also, that drive failed another round of HD Tune Pro Error Checks showing extreme DAMAGE to sectors all over the place. Lit up RED blocks all over like a Christmas tree. It's toast.

 

I am eviscerating it with a 35 Pass Gutmann wipe as we speak just to take out some aggression. 😒 Well worth the electricity...

 

I doubt any of the passes will actually finish with so many issues, but at least it makes me feel better. 😂

Edited by falconexe


After adding the old Disk 04 back into the array with a clean bill of health, I just Friggin hit 206TB usable. Approaching .25PB 😍

 

Array_20200114.thumb.PNG.6b28ffe8c0a40476a1aac7cea56c0e1f.PNG

 

 


Now you just need to get the rest to 14 TB and you'll be on your way to .5! ;-)

On 1/15/2020 at 1:18 AM, rl2664 said:

No Raid 1 Cache pool?

I had 2x 1TB drives in a pool for a bit, but put one of them into my other UNRAID server. I'll add another one soon.


Until then, I have all critical files being written directly to the array and larger (already backed up) files being written to the Cache. My mover runs every single night, so the risk is very low for me to lose something in cache, and if I did, I have backups.

Edited by falconexe

19 hours ago, Froberg said:

Now you just need to get the rest to 14 TB and you'll be on your way to .5! ;-)

I just can't wait to pick up my 1PB thumb-drive at the local Best Buy.

3 hours ago, falconexe said:

I just can't wait to pick up my 1PB thumb-drive at the local Best Buy.

It's probably going to happen eventually. Hoping transfer speeds go up too in that case.

