What is the proper way to replace these drives?

May 28, 2018

Three drives are showing SMART warnings, so I've got three more 4TB drives to replace them. All existing drives are 4TB.

1 of the 2 parity drives is failing.

2 of the 5 raided drives are failing.

What is the best procedure to replace them all? One at a time? Which one should I start with? Do the parity drive first then replace the other two drives at the same time?

May 29, 2018

Tools - Diagnostics, post complete zip

May 29, 2018

Author

I can do this later but is there a reason for it? Not asking for troubleshooting just advice on what order to replace and rebuild.

I'd guess the way to go about it is to replace the parity drive and let that rebuild, then replace one or both of the data drives and let those rebuild.

May 29, 2018

36 minutes ago, AJ Ouellet said:

Not asking for troubleshooting just advice on what order to replace and rebuild.

Without knowing what condition the drives are in, there is no way to know what the safest order of replacement is going to be. You say they are failing, and have smart warnings, but that is a very generic statement. We have no way of knowing what the actual condition is.

With 3 suspect drives and 2 parity drives, you could lose data if more than one of the other drives is unable to be read correctly while reconstructing the one you replace, because dual parity can only cover two missing drives at a time.

Proceeding blindly is not a good idea, and you asked for help, which is good. You deny us the ability to analyse the problem, which leaves us just as blind as you as to what to do.

May 29, 2018

@AJ Ouellet

I would test out the new drives, and then add them as UDs.

Then you can copy the data from the drives you want to replace to the new drives in parallel. Optionally you can compare checksums to ensure everything copied correctly.

Then you can do a new config, excluding the failing disks and including the new ones. Parity will then build. The three old drives you can hang on to as backups.

This is a lot faster and handles failure scenarios better IMO.
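For the optional checksum comparison, a minimal Python sketch might look like the following. The mount points at the bottom are hypothetical placeholders, and SHA-256 is just one reasonable choice of hash; nothing here is specific to Unraid.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source: Path, copy: Path) -> list[str]:
    """Return relative paths that are missing from the copy or differ in content."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        dst_file = copy / relative
        if not dst_file.is_file():
            problems.append(f"missing: {relative}")
        elif file_sha256(src_file) != file_sha256(dst_file):
            problems.append(f"differs: {relative}")
    return problems


if __name__ == "__main__":
    # Hypothetical mount points; substitute the real paths on your server.
    for line in compare_trees(Path("/mnt/disk1"), Path("/mnt/disks/new_disk1")):
        print(line)
```

An empty result means every file on the old disk has a byte-identical counterpart on the new one. Files that exist only on the new disk are not reported.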

May 29, 2018

Author

6 hours ago, jonathanm said:

Without knowing what condition the drives are in, there is no way to know what the safest order of replacement is going to be. You say they are failing, and have smart warnings, but that is a very generic statement. We have no way of knowing what the actual condition is.

With 3 suspect drives and 2 parity drives, you could lose data if more than one of the other drives is unable to be read correctly while reconstructing the one you replace, because dual parity can only cover two missing drives at a time.

Proceeding blindly is not a good idea, and you asked for help, which is good. You deny us the ability to analyse the problem, which leaves us just as blind as you as to what to do.

Understand.

The parity drive has 600+ CRC errors. One of the data drives has 300+ and the last one has only 1. I'll still try to post the diagnostics later tonight when I'm home with the server.

May 29, 2018

1 hour ago, AJ Ouellet said:

Understand.

The parity drive has 600+ CRC errors. One of the data drives has 300+ and the last one has only 1. I'll still try to post the diagnostics later tonight when I'm home with the server.

The CRC errors are normally not disk errors - they are caused by failed transfers between disk and controller and are quite often caused by the cable.

It's some of the other numbers that are way more interesting when it comes to disk health.
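To illustrate the distinction, here is a rough Python sketch that reads a `smartctl -A` style attribute table and separates the CRC counter (attribute 199) from a few attributes that are commonly watched for actual media problems. The sample table is made up for illustration and is not taken from the poster's diagnostics, and the choice of attributes 5, 197 and 198 is my assumption about which "other numbers" matter most.

```python
# Illustrative table in the layout printed by `smartctl -A`; values are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       612
"""

# Assumed set of attributes that point at the disk itself rather than the link.
MEDIA_ATTRIBUTES = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}
CRC_ATTRIBUTE = 199  # UDMA_CRC_Error_Count: transfer errors between disk and controller


def parse_raw_values(report: str) -> dict[int, int]:
    """Map attribute ID to raw value for rows whose raw value is a plain integer."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


def summarize(report: str) -> list[str]:
    """Return one note per non-zero attribute, separating disk and link problems."""
    raw = parse_raw_values(report)
    notes = []
    for attr_id, name in MEDIA_ATTRIBUTES.items():
        if raw.get(attr_id, 0) > 0:
            notes.append(f"{name} = {raw[attr_id]}: possible disk problem")
    if raw.get(CRC_ATTRIBUTE, 0) > 0:
        notes.append(
            f"UDMA_CRC_Error_Count = {raw[CRC_ATTRIBUTE]}: check cable and connection first"
        )
    return notes


if __name__ == "__main__":
    for note in summarize(SAMPLE):
        print(note)
```

With the sample above, only the CRC line is reported, which matches the situation described here: a high CRC count with clean media attributes points at cabling rather than the disk.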

May 29, 2018

Author

21 minutes ago, pwm said:

The CRC errors are normally not disk errors - they are caused by failed transfers between disk and controller and are quite often caused by the cable.

It's some of the other numbers that are way more interesting when it comes to disk health.

Two of the drives should be replaced anyway, as they have 4+ years of power-on time. Will post the diagnostics ASAP.

May 29, 2018

5 minutes ago, AJ Ouellet said:

Two of the drives should be replaced anyway, as they have 4+ years of power-on time.

Not an indicator of health. I have many drives still in perfect working order with twice that many working hours.

When you attach your diagnostics zip file, also list your general configuration (MB, HBA, PSU, etc). Many times symptoms that first show up as drive errors can be attributed to other factors, and changing drives can actually make things worse instead of better.

Do you have a complete set of backups for all files you don't want to lose?

May 29, 2018

Author

10 minutes ago, jonathanm said:

Not an indicator of health. I have many drives still in perfect working order with twice that many working hours.

When you attach your diagnostics zip file, also list your general configuration (MB, HBA, PSU, etc). Many times symptoms that first show up as drive errors can be attributed to other factors, and changing drives can actually make things worse instead of better.

Do you have a complete set of backups for all files you don't want to lose?

No backups but it's just media that can be acquired again should the worst happen. Will post diag tonight after kiddos get to bed.

May 29, 2018

Author

I'd have to open the box to find out what PSU I'm running, but I think it is a Thermaltake 800W.

Here is the diag and a screenshot of the system specs from the Unraid info page.

mediaserver-smart-20180527-1617.zip

May 30, 2018

53 minutes ago, AJ Ouellet said:

I'd have to open the box to find out what PSU I'm running, but I think it is a Thermaltake 800W.

Here is the diag and a screenshot of the system specs from the Unraid info page.

mediaserver-smart-20180527-1617.zip

No, that is not the diagnostics. It is just SMART for a single disk.

21 hours ago, trurl said:

Tools - Diagnostics, post complete zip

May 30, 2018

Author

Not sure what I did before, was in a rush. This should have it all.

mediaserver-diagnostics-20180529-2118.zip

May 30, 2018

15 hours ago, pwm said:

The CRC errors are normally not disk errors - they are caused by failed transfers between disk and controller and are quite often caused by the cable.

And if you recently upgraded from some version before 6.4 it may be that you are just now noticing these because the new version is now monitoring them by default. They might be old connection issues that aren't even occurring now. You can acknowledge them and you won't get notified again unless they increase.

I don't see any reason to replace any disks. None are disabled and no serious SMART issues. If it ain't broke don't fix it.
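The acknowledge behaviour described above can be pictured with a tiny Python sketch. This is only a model of the idea (notify again only when the counter rises past the acknowledged value), not Unraid's actual implementation, and the class and method names are made up.

```python
from dataclasses import dataclass


@dataclass
class CounterMonitor:
    """Hypothetical model of an acknowledged SMART counter such as CRC errors."""

    acknowledged: int = 0  # last raw value the user has acknowledged

    def needs_attention(self, current: int) -> bool:
        # Only a value above the acknowledged baseline triggers a new warning.
        return current > self.acknowledged

    def acknowledge(self, current: int) -> None:
        # Acknowledging stores the current value as the new baseline.
        self.acknowledged = current


if __name__ == "__main__":
    monitor = CounterMonitor()
    print(monitor.needs_attention(612))  # old errors, not yet acknowledged
    monitor.acknowledge(612)
    print(monitor.needs_attention(612))  # unchanged count after acknowledging
    print(monitor.needs_attention(613))  # count increased, so the problem is ongoing
```

If the count keeps climbing after it has been acknowledged, the connection problem is still happening and the cable or port is worth checking.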

May 30, 2018

Author

1 hour ago, trurl said:

And if you recently upgraded from some version before 6.4 it may be that you are just now noticing these because the new version is now monitoring them by default. They might be old connection issues that aren't even occurring now. You can acknowledge them and you won't get notified again unless they increase.

I don't see any reason to replace any disks. None are disabled and no serious SMART issues. If it ain't broke don't fix it.

OK, thanks. I'll acknowledge them and see if they come back. I have 1 more port, so I'll be able to add another 4TB drive... Now to find a card to add a few more ports so I can use the rest of the drives.

May 30, 2018

5 hours ago, AJ Ouellet said:

OK, thanks. I'll acknowledge them and see if they come back. I have 1 more port, so I'll be able to add another 4TB drive... Now to find a card to add a few more ports so I can use the rest of the drives.

I recommend only adding disks as needed to increase capacity. Adding disks just because you have them is only adding more points of failure. Even more so if you have to add a controller just to add the disks.

I guess the counter-argument to that is your warranty clock is ticking, but you could just test them really well with preclear or something and then set them aside so they aren't accumulating power-on hours.
