
I have a failed drive but won't have a replacement for probably a week - leave it disabled? or remove?



I am going to RMA the drive but it will take longer than a week to receive the replacement. I also placed an order on Amazon for a larger drive but that will take a week as well since they don't currently have inventory.

 

Is it safer to remove the drive now? I have enough free space on the array to allow for the drive to be removed.

Link to comment

Are you sure the drive really has failed? Often a drive gets disabled (marked with a red ‘x’) because a write failed for some reason, but the drive itself is OK. We might be able to give an opinion if you supply your system’s diagnostics and mention which drive it is.

 

Unraid will stop using a drive that is disabled, so when you remove it is up to you.

 

 

Link to comment

For a few months now I've been seeing this warning:

UDMA CRC error count

Yesterday was the first time I got a failed write, and the drive was disabled as you mentioned. I've already tried swapping out the cable and making sure everything is seated properly. This is also a refurbished drive that was sent as a replacement for another drive that failed.

Link to comment

Looks like you still have connection problems with that drive. There is no SMART report for it, and the syslog shows connection problems on some port, but I can't tell which disk it is because the logs in your diagnostics are full of OOMs (looks like Plex) and don't go back far enough to identify the disk on that port.

 

Check all connections, SATA and power, both ends, including splitters. Then start the array and post new diagnostics.

Link to comment
13 hours ago, andyd said:

no CRC errors using the same cables

The CRC error count is recorded by the firmware of each drive, so if you swap disks you see whatever count the new drive has recorded, and if you add the original drive back into the array it will still show the CRC errors it had already recorded. All of the SMART attributes are like that.

 

And not all connection problems will be recorded as CRC errors. Obviously if a drive isn't connected at all it can't know anything about that. A CRC error just means there was some inconsistency in the data the drive received: the data didn't pass the checksum test, so the drive knows it wasn't transmitted accurately.
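To make the checksum test concrete, here is a minimal sketch of the idea in Python. It uses the generic CRC-32 from `zlib` purely as an illustration; it is not the SATA link-layer implementation, and the function names and the counter variable are hypothetical.

```python
import zlib


def send_frame(payload: bytes) -> tuple[bytes, int]:
    """Sender side: compute a CRC over the payload and send both."""
    return payload, zlib.crc32(payload)


def receive_frame(payload: bytes, checksum: int, crc_error_count: int) -> tuple[bool, int]:
    """Receiver side: recompute the CRC and compare it with the one received.

    On a mismatch the receiver bumps its own persistent error counter,
    which is why the count travels with the drive rather than the cable.
    """
    ok = zlib.crc32(payload) == checksum
    return ok, crc_error_count + (0 if ok else 1)


# A clean transfer passes the check and leaves the counter alone.
payload, checksum = send_frame(b"sector data")
ok, count = receive_frame(payload, checksum, 0)

# Flip a single bit "in the cable": the check fails and the counter increases.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
bad_ok, count = receive_frame(corrupted, checksum, count)
```

In the real case the failed transfer is simply retried, so the only lasting trace of the event is the incremented counter.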

 

 

Link to comment

Look on the Dashboard page to see if there are any SMART warnings for your disks. You would have seen that CRC warning there when the drive was installed but maybe you didn't notice. You can click on the warning and Acknowledge it and it won't warn again unless it increases.

 

I usually just Acknowledge the occasional CRC warning. If it continues to increase you need to see about fixing the connection. Most other SMART attributes that Unraid monitors for warnings may be serious problems requiring disk replacement.
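The Acknowledge behaviour described above can be sketched roughly as follows. This is only an illustration of the "warn again only if the count increases" rule, not Unraid's actual code; the class and method names are hypothetical.

```python
class CrcWarningTracker:
    """Tracks the last acknowledged CRC error count for one drive (hypothetical helper)."""

    def __init__(self) -> None:
        # Nothing acknowledged yet, so any non-zero count triggers a warning.
        self.acknowledged_count = 0

    def should_warn(self, raw_count: int) -> bool:
        # Warn only when the drive reports more errors than were acknowledged.
        return raw_count > self.acknowledged_count

    def acknowledge(self, raw_count: int) -> None:
        # The drive's own counter is not reset; we only remember the value seen.
        self.acknowledged_count = raw_count


tracker = CrcWarningTracker()
first = tracker.should_warn(5)           # drive arrives with 5 recorded errors
tracker.acknowledge(5)
after_ack = tracker.should_warn(5)       # same count: stays quiet
after_increase = tracker.should_warn(7)  # count grew: warn again
```

This also shows why a drive moved to a different port warns immediately on a system that has not acknowledged its existing count: the number comes from the drive, not from the cable or port.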

Link to comment

Sorry just getting back to this as I had taken out the drive and replaced it with a new larger drive...

 

1. The new drive is using the same cables that the original CRC error drive was on - no issues so far

2. After going through some setup I wanted to do, I added the original drive back to get it onto the array again, so it is now on different cables and a different SATA port.

 

Immediately after starting the server I got a message about the CRC error count on the original drive. And yeah, CRC warnings are something that I've been seeing since last year on this drive, and I kept acknowledging them. The write error followed by the drive being disabled was the first time I had seen that happen.

 

Seems like something is up with the drive? Considering the change in cables / port. But yeah SMART doesn't report any issues.

Edited by andyd
Link to comment
2 hours ago, andyd said:

Immediately after starting the server I got a message about the CRC error count on the original drive. And yeah, CRC warnings are something that I've been seeing since last year on this drive, and I kept acknowledging them.

The count of CRC failures is stored on the hard drive itself. This count cannot be reset to zero by any means known to mortal men. 99.99% of the time, a CRC error is caused by something besides the drive. That is why you can tell Unraid to ignore them. (This function is provided on the Dashboard tab, as I recall...) Basically, it tells Unraid to ignore the count until it increases. An occasional one or two CRC errors are completely acceptable, as they result in no actual data loss or error. They just slow down data transfer, because the data has to be resent until there is no error in transmission. (The most common causes are bad cables or 'loose' SATA connectors.)

 

I have an old disk with 72,000+ errors on it (the result of a cheap Chinese SATA card...) that had its last CRC error back in 2018.

Edited by Frank1940
Link to comment

Well, I've ruled out cables and loose connectors, unless the connectors at fault can also be the ones on the drive itself.

 

And I do get 1 or 2 at a time. It's happened at least 5 times, up to the last time, when the drive was disabled because of a write error. I guess my concern is that my drive is still under warranty for another 6 months. Is it worth the risk of ignoring the errors and having to keep acknowledging them or dealing with a disabled drive?

Edited by andyd
Link to comment

You have the option of RMA'ing the drive. (It is my understanding that you almost always get back a refurbished drive...) However, while it has been a long time since I have RMA'ed a drive, I have never been questioned about why I was returning one, and they never looked at anything but the shipping/RMA labels on the package when they received it. I know this because the replacement always shipped the same day. (I have assumed that they realize that removing a hard drive is not a trivial task that one would simply do on a whim. If the drive is not found to be defective on their inspection, are they really prepared for the flak and bad publicity that a pizzed-off customer might generate if they denied an RMA claim?)

Edited by Frank1940
Link to comment
16 hours ago, trurl said:

Did you check power connections, both ends, including splitters?

It's on a different power connection as well. I swapped it into an entirely different position where all of the cables to it were different.

 

 

16 hours ago, Frank1940 said:

You have the option of RMA'ing the drive. (It is my understanding that you almost always get back a refurbished drive...) However, while it has been a long time since I have RMA'ed a drive, I have never been questioned about why I was returning one, and they never looked at anything but the shipping/RMA labels on the package when they received it. I know this because the replacement always shipped the same day. (I have assumed that they realize that removing a hard drive is not a trivial task that one would simply do on a whim. If the drive is not found to be defective on their inspection, are they really prepared for the flak and bad publicity that a pizzed-off customer might generate if they denied an RMA claim?)

 

Oh, the RMA was already approved. I just have to pay the $25 for an advance replacement, though I guess I don't need the advance option anymore since I already removed the drive from the array, hmm.

Link to comment
