Want to add a brand new disk as Parity 2 - Reporting errors and goes into error state - SMART is fine


Solved by Matheew

Hello unRAID Community!

 

I've bought a brand new WD Red 12TB disk to replace my current 8TB parity drive. I've read about several ways to do this and decided on the following method:
 

  • Stop the array.
  • Install the new drive.
  • In Main, assign the new 12TB drive to the parity 2 slot.
  • Start the array and allow parity to rebuild.
  • Stop the array.
  • In Main, unassign the old 8TB parity drive from the parity 1 slot.
  • Start the array.

 

However, as soon as I assign the new 12TB drive to the parity 2 slot and start the array, things go south. The parity rebuild pauses almost instantly and unRAID shows the error messages in the attached screenshot. The new 12TB drive reports errors, and unRAID tells me that it is in an error state.

 

unraid-parity-error.thumb.JPG.b970c622f246a17a88a3befc1cfd75c3.JPG

 

In order to get back to normal I:

  • Cancel the paused parity build.
  • Stop the array.
  • Remove the 12TB drive from the parity 2 slot.
  • Start the array.

 

unRAID then tells me everything is fine:

unraid-array-repair.thumb.JPG.21a1f7faa2cb464a19b6e049e57180b8.JPG

 

If I then repeat the process above, the error occurs again in the same way, and that is where I am right now.

 

My first thought was obviously a faulty disk, but that doesn't seem too likely since the disk is brand new. I ran an extended SMART test on the disk yesterday and it reported no errors at all; you can find the SMART report in the attached diagnostics file.

 

System:

unRAID version 6.8.3
Logic 24-bay Hot Swap with IBM M1015

 

Thanks in advance!

unraid-diagnostics-20220818-0848.zip

Edited by Matheew
Link to comment

Likely either the disk or the port / cable it's connected to has issues. Probably the latter since there are I/O errors in the log.
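As a rough way to see which device the I/O errors belong to, the syslog from the diagnostics zip can be scanned per device. This is only a minimal sketch: the sample lines below are made up for illustration (they are not taken from the attached diagnostics), and the regular expression assumes the usual kernel wording `I/O error, dev <name>, sector <n>`, whose prefix varies between kernel versions.

```python
import re
from collections import Counter

# Matches kernel block-layer messages such as
# "... I/O error, dev sdk, sector 128"; the text before it varies by kernel.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")

def count_io_errors(log_text):
    """Return a Counter mapping device name -> number of I/O error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up sample lines, for illustration only.
sample = """\
Aug 18 08:40:01 Tower kernel: print_req_error: I/O error, dev sdk, sector 128
Aug 18 08:40:01 Tower kernel: print_req_error: I/O error, dev sdk, sector 136
Aug 18 08:40:02 Tower kernel: some unrelated message
"""

print(count_io_errors(sample))
```

If only the new disk shows up in the counts while the other disks on the same controller stay clean, that points at that disk or its path (port, cable, backplane slot) rather than the array as a whole.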

 

1 hour ago, Matheew said:

doesn't seem too likely since the disk is brand new.

 

"Infant mortality" is a thing on new devices; that's why it's typically recommended to leave new disks unassigned and run a preclear on them before adding them to the array, just to exercise them and see whether they're going to die outright or whether there are other communication-related issues.

Edited by Kilrah
Link to comment

Thanks for the reply!

 

My M1015 is now updated to firmware 20.00.07.00, but the issue is still the same.

I connected the HDD directly to the motherboard (obviously not a feasible long-term solution) and the issue is gone.

 

What conclusion can we draw from this? I would not rush to the conclusion that the M1015 is broken, since I have four drives that have been working perfectly for more than a year. The difference is that I have never before connected a drive larger than 8TB to the HBA. I've googled but could not find anything indicating that there is a maximum HDD size for the card in question.
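As a rough sanity check on the size question, here is the arithmetic for the classic capacity ceiling that older controllers are known for: 32-bit sector addresses with 512-byte sectors. This is only a sketch under those two assumptions; it says nothing about other firmware or compatibility quirks.

```python
SECTOR_BYTES = 512   # assumed logical sector size
LBA_BITS = 32        # address width of the classic limit (assumption)

# Largest capacity addressable with 32-bit sector numbers.
limit_bytes = (2 ** LBA_BITS) * SECTOR_BYTES
limit_tb = limit_bytes / 10 ** 12

def exceeds_limit(capacity_tb):
    """True if a drive of the given decimal-TB capacity is above the limit."""
    return capacity_tb * 10 ** 12 > limit_bytes

print(f"32-bit LBA limit: {limit_tb:.2f} TB")
for size in (8, 12):
    print(size, "TB exceeds limit:", exceeds_limit(size))
```

Both 8TB and 12TB are well above that ceiling, so if the 8TB drives already work on the HBA, this particular limit cannot be what separates them from the 12TB drive.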

 

Any thoughts?

Edited by Matheew
Link to comment
12 minutes ago, JorgeB said:

I would suspect some compatibility issue between the HBA and that disk model.

 

Hmm, it is a WD Red, so no obscure brand or anything.

 

6 minutes ago, Frank1940 said:

Why not?

 

A number of reasons, but the primary one is that my chassis uses SAS backplanes to install disks, which is how I want it for expansion possibilities. There is no way to mount a disk permanently and connect it directly to the motherboard.

Link to comment
2 hours ago, Matheew said:

my chassis is using SAS backplanes

 

I seem to recall that backplanes occasionally go bad. (I have one of the nine hot-swap backplanes in my media server where an installed disk throws CRC errors. I have not troubleshot it to determine whether it is the cable, the backplane, or the controller lane, as I don't need it at this point...)
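For anyone wanting to check for CRC errors like these, the SMART attribute table (for example from `smartctl -A`) can be read for attribute 199, `UDMA_CRC_Error_Count`, which many SATA disks report. A minimal sketch, assuming the usual ten-column table layout; the sample text is made up for illustration and not every disk exposes this attribute.

```python
def crc_error_count(smart_attributes_text):
    """Return the raw value of attribute 199, or None if it is not listed."""
    for line in smart_attributes_text.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

# Made-up sample table, for illustration only.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       7
"""

print(crc_error_count(sample))
```

A raw value that keeps climbing usually points at the link (cable, backplane, or port) rather than the platters, which is why a clean extended self-test and a bad connection can coexist.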

Link to comment
2 hours ago, Frank1940 said:

 

I seem to recall that backplanes occasionally go bad. (I have one of the nine hot-swap backplanes in my media server where an installed disk throws CRC errors. I have not troubleshot it to determine whether it is the cable, the backplane, or the controller lane, as I don't need it at this point...)

 

Well... I currently have two of the six SAS backplanes connected to the M1015, with disks on each of them working without issue, and moving the new HDD from one backplane to the other does not fix the problem. It seems very unlikely that both backplanes are faulty while none of the currently installed disks are affected.

Link to comment
  • 2 weeks later...
