(RESOLVED) UDMA CRC error on 3 drives simultaneously


FreeMan


I'm migrating some older drives from my primary array to replace even older, smaller drives in my backup array. I've completed upgrading parity from a 2TB drive to a 3TB drive. I then replaced one of the 1TB drives with the old 2TB parity drive. So far, so good.

 

This evening, I physically removed the 1TB replaced drive, and pulled another 1TB drive, replacing it with a 2TB pulled from the main server. When I powered the backup server back on, it showed 2 drives missing instead of 1. I calmly powered down and checked all power & SATA cables. Nothing seemed loose, but when I powered it back up, the same 2 drives were missing.

 

I changed plans & in the Array Devices page, I left the 1TB that was showing and assigned the newer 2TB drive to replace the one that was totally missing. I started the array and a rebuild commenced as expected. I chalked it up to a possible total drive failure, but I wasn't worried since I had the spare drive and am replacing that poor old 1TB drive anyway. (The 1TB that I've already pulled had been spinning for 10.5 years, the other two for north of 5 years each. They've done yeoman's service and deserve an honorable discharge and a quiet retirement.)

 

About 45 minutes into the disk rebuild now, and I just got a couple of notifications on Pushover indicating that "UDMA CRC error count is 1" on Parity, Disk 1 and Disk 2. Disk 3 is the one currently being rebuilt, Disk 4 is the one remaining 1TB drive, and Disk 5 is the prior 2TB parity drive that's now doing duty as a data drive.

 

Are these critical errors?

Does it mean that I've got 3 "newer" drives dying on me, just as I was replacing the oldest of old drives?

Should I consider panicking?

 

This is my backup server that is going to (but hasn't yet) replace my CrashPlan subscription, but at the moment, I've still got everything on CP, so I'm not horribly worried.

 

Diagnostics are attached.

backup-diagnostics-20190910-0039.zip

Edited by FreeMan

Thanks for that @Harro.

Looks like it's the cables (I didn't reseat everything; I just fiddled with the Drive 3 & Drive 4 connectors, since those were the two that didn't show up when I powered back up).

Or, it's the power supply (I'm running the same Corsair CX430, but I've only got 6 drives and an SSD cache)

Or, the drives actually are dying.

 

Well, the drive rebuild seems to be proceeding at what is a reasonable pace for this box. (35MB/s ain't reasonable, but this is really old hardware (AMD FX4100, anyone?), and I plugged one drive back in via a USB 2.0-to-SATA adapter for the time being.) I'll let it finish the rebuild, then take the box down again when it's done tomorrow, pull and reseat all the cables, and see what happens.
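
For a rough sense of what that rate means in wall-clock time, here is a minimal Python sketch. It assumes decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes) and a constant transfer rate, which real rebuilds do not hold; the function name `rebuild_hours` is made up for illustration.

```python
def rebuild_hours(disk_bytes: float, rate_bytes_per_s: float) -> float:
    """Estimate rebuild duration in hours at a constant transfer rate."""
    return disk_bytes / rate_bytes_per_s / 3600


TB = 1e12  # decimal terabyte, as drive vendors count it (assumption)
MB = 1e6   # decimal megabyte (assumption)

# Rates quoted in this thread, applied to a 2TB data disk.
for rate in (35, 85):
    hours = rebuild_hours(2 * TB, rate * MB)
    print(f"{rate} MB/s: about {hours:.1f} h")
```

At 35MB/s a 2TB disk works out to roughly 16 hours, which is consistent with the rebuild running into the next day.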

 

To be continued...


After the rebuild of Disk 3 on the new(er) 2TB drive, I powered the system off, opened the side of the case and realized I didn't know which drive was which. :(

 

I booted the server back up and it came up without any CRC errors. I took a couple of quick pics so I knew which serial number I needed to pull, powered down, swapped out the drive and have rebooted again. Still no CRC errors.

 

It's currently rebuilding the newest and last upgraded drive.

 

My guess is that one (or more?) of the SATA cables just wasn't seated fully. I did press them all in against the drives while I had the box open, so I'm going to cross my fingers and assume that was the issue. The drive reconstruction is positively screaming along at 85MB/s right now (now that everything is 2TB or greater, that increased speed does make some sense). I'll put this issue to bed now and only be back if something else goes wonky at some point in the future.
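
One way to check the loose-cable theory over time: the UDMA CRC error count is SMART attribute 199, a cumulative counter that does not reset, so what matters is whether it keeps climbing after the cables are reseated. The sketch below is a hedged Python example that pulls the raw value out of `smartctl -A` text and compares two snapshots. It assumes the usual ten-column attribute table that smartmontools prints for ATA drives; the function names and drive labels are hypothetical.

```python
import re

CRC_ATTRIBUTE_ID = 199  # SMART attribute ID for UDMA_CRC_Error_Count


def crc_error_count(smartctl_output: str):
    """Return the raw value of attribute 199 from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; RAW_VALUE is the 10th column.
        if len(fields) >= 10 and fields[0] == str(CRC_ATTRIBUTE_ID):
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


def crc_growth(before: dict, after: dict) -> dict:
    """Map each drive label to how much its CRC count rose between snapshots."""
    return {drive: after[drive] - before[drive]
            for drive in before if drive in after}
```

Feeding `crc_error_count` the saved `smartctl -A` output for each drive before and after reseating, then passing the two dictionaries to `crc_growth`, gives a per-drive delta. A delta of zero across several days of use would support the cable explanation; a count that keeps rising points back at the cable, port, or power.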

