One of my HDDs shows an error

SPOautos · August 24, 2020

I just built this server and just now installed Unraid for the first time.

The HDDs came from another machine where they were working great. They were wiped, and a health check was run that showed they were in perfect health. They are four 6TB HGST drives that are 2 or 3 years old but have not been worked hard.

Unraid is showing one of them with an error.

On the Self Test screen, under Attributes, there is a column of numbers on the left (1-12 and 192-199), and row 198 says "offline uncorrectable". Here are all the values going across:

Flag: 0x0008

Value: 100

Worst: 100

Threshold: 000

Type: Old Age

Updated: offline

Failed: never

Raw Value: 13

I have no idea what all that means. Why does it say Old Age? They are a couple of years old but not worked very hard, so I'm not sure what constitutes "old age".

I started running the "SMART extended self-test" but it looks like it's going to take a very long time before I have results to share about that.

Should I stop the extended self test and do something else instead?

There's no data on the drive.

JorgeB · August 24, 2020

Old age is just the type of attribute. While a non-zero value for offline uncorrectable is never good, it doesn't mean that the disk is failing. Wait for the extended SMART test: if it passes, the disk is good for now, and as long as that value doesn't increase it should be fine.

SPOautos · August 24, 2020

23 minutes ago, johnnie.black said:

Old age is just the type of attribute. While a non-zero value for offline uncorrectable is never good, it doesn't mean that the disk is failing. Wait for the extended SMART test: if it passes, the disk is good for now, and as long as that value doesn't increase it should be fine.

Thank You for the info

Do you have a rough idea how long the extended SMART test should take on a 6TB drive with no data?

Is the "offline uncorrectable" raw value of 13 telling me that it found 13 bad sectors?

Edited August 24, 2020 by SPOautos

JorgeB · August 24, 2020

6 minutes ago, SPOautos said:

Do you have a rough idea how long the extended SMART test should take on a 6TB drive with no data?

2 to 3 hours per TB. You can also see the estimated time for that specific drive in the SMART report, on this line:

Extended self-test routine
recommended polling time:      (1128) minutes.

7 minutes ago, SPOautos said:

Is the "offline uncorrectable" raw value of 13 telling me that it found 13 bad sectors?

Possibly, but not necessarily. Different firmware sometimes behaves differently in how it handles bad sectors.
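
As a rough worked example of the estimate above, the polling time can be pulled out of the SMART report text and converted to hours. This is only a sketch: the function name `polling_minutes` is made up for illustration, the report text is just the two lines quoted in this post, and the 6 TB capacity is taken from the thread.

```python
import re
from typing import Optional

def polling_minutes(report: str) -> Optional[int]:
    """Return the extended self-test polling time in minutes, or None if absent."""
    # The entry spans two lines: "Extended self-test routine" followed by
    # "recommended polling time: (1128) minutes."
    match = re.search(
        r"Extended self-test routine\s+recommended polling time:\s*\(\s*(\d+)\)\s*minutes",
        report,
    )
    return int(match.group(1)) if match else None

# The two lines quoted in the post above.
report = """Extended self-test routine
recommended polling time:      (1128) minutes."""

minutes = polling_minutes(report)
hours = minutes / 60          # 1128 minutes is 18.8 hours
hours_per_tb = hours / 6      # roughly 3.1 hours per TB for a 6 TB drive

print(f"{minutes} min = {hours:.1f} h, about {hours_per_tb:.1f} h per TB")
```

So for this particular drive the firmware's own estimate works out to a little over 3 hours per TB, at the upper end of the rule of thumb.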

SPOautos · August 25, 2020

8 hours ago, johnnie.black said:
2 to 3 hours per TB. You can also see the estimated time for that specific drive in the SMART report, on this line:
Extended self-test routine
recommended polling time:      (1128) minutes.
Possibly, but not necessarily. Different firmware sometimes behaves differently in how it handles bad sectors.

The extended check is at 90%, so it's pretty close to finished. Even if it clears this disk, should I use it as a data disk or as my parity disk? I'm thinking parity, because if it goes bad I can replace the disk, the parity is rebuilt, and my data isn't at risk of corruption.

But I'm FAR from an IT person....what do you think?

SPOautos · August 25, 2020

11 hours ago, johnnie.black said:
2 to 3 hours per TB. You can also see the estimated time for that specific drive in the SMART report, on this line:
Extended self-test routine
recommended polling time:      (1128) minutes.
Possibly, but not necessarily. Different firmware sometimes behaves differently in how it handles bad sectors.

The finished extended test came back error free. Yet the dashboard still lists that drive with an orange thumbs down saying errors. Do I need to reboot the server, or should I do something else? Any other type of test?

I'm not sure what I should do from here.

Edited August 25, 2020 by SPOautos

ChatNoir · August 25, 2020

It depends on what caused the thumbs down.

You should post your diagnostics so the guys that know SMART can provide suggestions on what to do next.

SPOautos · August 25, 2020

3 hours ago, ChatNoir said:

It depends on what caused the thumbs down.

You should post your diagnostics so the guys that know SMART can provide suggestions on what to do next.

I just installed Unraid for the first time yesterday.....so please excuse the ignorance lol.....which diagnostics, and how do I get them from Unraid to here? I don't have any plugins or dockers set up yet, and not even an array (I was waiting on this before I start labeling drives).

I saw that there is an option to download the results of my extended test, but I don't have email or anything like that set up in Unraid to get it out.

JorgeB · August 25, 2020

Tools -> Diagnostics then attach the complete zip.

ChatNoir · August 25, 2020

No problem. I saw 30+ posts on your account, so I incorrectly assumed that you had more experience with Unraid.

The diagnostics gather tons of anonymized information about your system (SMART, shares, CPU, memory, etc.).

It is available in Tools / Diagnostics.

SPOautos · August 25, 2020

12 minutes ago, johnnie.black said:

Tools -> Diagnostics then attach the complete zip.

Is this correct?

tower-diagnostics-20200825-0521.zip

ChatNoir · August 25, 2020

Yes !

I see that your disk sdc does indeed have:

198 Offline_Uncorrectable ---R-- 100 100 000 - 13

Let's see what the experts can make of that; I am learning as you are.
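
To make the columns in that line easier to read, here is a small sketch that splits it into named fields. It assumes the line follows smartctl's brief attribute layout (ID, name, flags, value, worst, threshold, fail, raw value); the names `SmartAttribute`, `parse_attribute` and `trips_threshold` are invented for this example and are not part of Unraid or smartctl.

```python
from typing import NamedTuple

class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int       # normalized value reported by the drive
    worst: int       # lowest normalized value seen so far
    threshold: int   # normalized value at or below which the attribute fails
    fail: str        # "-" when the attribute has never failed
    raw: str         # vendor-specific raw value, kept as text

def parse_attribute(line: str) -> SmartAttribute:
    # Split into at most 8 fields so a raw value containing spaces stays intact.
    fields = line.split(None, 7)
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    attr_id, name, flags, value, worst, threshold, fail, raw = fields
    return SmartAttribute(int(attr_id), name, flags, int(value),
                          int(worst), int(threshold), fail, raw.strip())

def trips_threshold(attr: SmartAttribute) -> bool:
    # Assumption: a threshold of 0 means the attribute can never fail this way.
    return attr.threshold > 0 and attr.value <= attr.threshold

attr = parse_attribute("198 Offline_Uncorrectable ---R-- 100 100 000 - 13")
print(attr.name, "raw =", attr.raw, "threshold tripped:", trips_threshold(attr))
```

Read this way, the normalized value (100) is nowhere near the threshold (000), which fits the "Failed: never" column in the first post. The raw value of 13 is the part worth watching over time.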

SPOautos · August 25, 2020

1 minute ago, ChatNoir said:

Yes !

I see that your disk sdc does indeed have:

198 Offline_Uncorrectable ---R-- 100 100 000 - 13

Let's see what the experts can make of that; I am learning as you are.

Yes, that's what prompted the original post. But then when I ran the extended smart test, it found zero errors. I guess the errors were in the past? I have no idea really.

SPOautos · August 25, 2020

29 minutes ago, johnnie.black said:

Tools -> Diagnostics then attach the complete zip.

Thought I'd also add the results of the SMART test.....

tower-smart-20200825-0541.zip

ChatNoir · August 25, 2020

It is already included in the diagnostics; that is where I looked.

hawihoney · August 25, 2020

7 hours ago, SPOautos said:

orange thumbs down

The orange mark comes from the global SMART settings under Settings/Disk settings, which let you define which SMART attributes lead to this orange mark. 198 Offline_Uncorrectable is one of them.

SPOautos · August 25, 2020

34 minutes ago, hawihoney said:

The orange mark comes from the global SMART settings under Settings/Disk settings, which let you define which SMART attributes lead to this orange mark. 198 Offline_Uncorrectable is one of them.

Ahhhh.....something tells me I'll be learning something new every day for a longggg time. Thank You!

Well, hopefully someone FAR more knowledgeable than me can look at this stuff and tell me if it's decently safe to use this HDD.

JorgeB · August 25, 2020

The SMART test completed successfully, so the disk is good for now, but it failed two previous ones:

# 1  Extended offline    Completed without error       00%     21384         -
# 2  Extended offline    Completed: read failure       90%     21288         9432616
# 3  Short offline       Completed: read failure       10%     21127         9432616

For now, acknowledge the SMART attributes, but if there are any more read errors in the near future it's best to replace the disk.
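
For anyone who wants to pick the failures out of a longer self-test log, here is a minimal sketch that parses the three lines above. It assumes the columns are test number, description, status, percent remaining, lifetime hours and first error LBA, and that columns are separated by at least two spaces; `SelfTest` and `parse_selftest_log` are names made up for this example.

```python
import re
from typing import List, NamedTuple, Optional

class SelfTest(NamedTuple):
    num: int
    description: str
    status: str
    remaining_pct: int
    lifetime_hours: int
    first_error_lba: Optional[int]   # None when the log shows "-"

# Columns are separated by two or more spaces; single spaces occur inside fields.
LINE_RE = re.compile(r"#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)")

def parse_selftest_log(text: str) -> List[SelfTest]:
    tests = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip headers and anything that is not a log entry
        num, desc, status, pct, hours, lba = m.groups()
        tests.append(SelfTest(int(num), desc, status, int(pct), int(hours),
                              None if lba == "-" else int(lba)))
    return tests

log = """\
# 1  Extended offline    Completed without error       00%     21384         -
# 2  Extended offline    Completed: read failure       90%     21288         9432616
# 3  Short offline       Completed: read failure       10%     21127         9432616
"""

tests = parse_selftest_log(log)
failed = [t for t in tests if t.first_error_lba is not None]
for t in failed:
    print(f"test #{t.num} ({t.description}) failed at LBA {t.first_error_lba} "
          f"after {t.lifetime_hours} power-on hours")
```

Both failed tests stopped at the same LBA (9432616), and the passing extended test ran 96 power-on hours after the last failure.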

SPOautos · August 25, 2020

18 minutes ago, johnnie.black said:
The SMART test completed successfully, so the disk is good for now, but it failed two previous ones:
# 1  Extended offline    Completed without error       00%     21384         -
# 2  Extended offline    Completed: read failure       90%     21288         9432616
# 3  Short offline       Completed: read failure       10%     21127         9432616
For now, acknowledge the SMART attributes, but if there are any more read errors in the near future it's best to replace the disk.

Could that be because something happened like the computer shutting off during testing?

Is it odd that it didn't pass at one point and now it does?

Do you think I should make this one my parity disk as opposed to a data disk? That way, if it gets flaky, I just replace it and it recreates the parity data, without risking corruption of my actual data.

itimpi · August 25, 2020

2 minutes ago, SPOautos said:

Do you think I should make this one my parity disk as opposed to a data disk? That way, if it gets flaky, I just replace it and it recreates the parity data, without risking corruption of my actual data.

I would say it should NOT be your parity disk. If a disk fails then Unraid relies on being able to reliably read ALL the other drives plus parity to recreate the contents of the failed drive. Therefore if you put your unreliable disk in as parity you have no confidence you can recover a failed array disk with its contents uncorrupted.
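
A toy illustration of that point, assuming single parity is a plain byte-wise XOR across the data disks. The "disks" here are just short byte strings invented for the example, and `xor_blocks` is an illustrative helper, not Unraid code.

```python
from functools import reduce
from typing import Iterable

def xor_blocks(blocks: Iterable[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three pretend data disks of equal size, plus the parity computed from them.
data1, data2, data3 = b"movie", b"photo", b"music"
parity = xor_blocks([data1, data2, data3])

# Data disk 2 fails: rebuild it from the surviving data disks plus parity.
rebuilt = xor_blocks([data1, data3, parity])
print(rebuilt == data2)      # the rebuild matches the lost disk

# Same rebuild, but the parity disk returns one wrong byte.
bad_parity = bytes([parity[0] ^ 0x01]) + parity[1:]
corrupted = xor_blocks([data1, data3, bad_parity])
print(corrupted == data2)    # the rebuilt disk no longer matches
```

A single bad byte read from parity lands directly in the rebuilt data at the same offset, which is why every disk taking part in a rebuild, parity included, has to read reliably.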

SPOautos · August 25, 2020

1 hour ago, itimpi said:

I would say it should NOT be your parity disk. If a disk fails then Unraid relies on being able to reliably read ALL the other drives plus parity to recreate the contents of the failed drive. Therefore if you put your unreliable disk in as parity you have no confidence you can recover a failed array disk with its contents uncorrupted.

Okay, thinking about it, I have four 6TB drives total. I can use 3 for data and 1 for parity. I only have about 8TB of data right now, so I'll have one data drive just sitting empty, which could be this one that's had errors before. Or does Unraid automatically spread the data out across all three data drives and use them all?

Edited August 25, 2020 by SPOautos

itimpi · August 25, 2020

The way that Unraid spreads data across drives depends on the settings you have set up for specific shares.

If you do not need all the drives at the moment, I would recommend that you do not even add this one to the array, but keep it available for when you actually need it. That could be as an additional drive if you need the space, or as a replacement for a failed drive. Each additional drive is a potential additional point of failure, so why not avoid even that small chance at this point?

hawihoney · August 25, 2020

I'm just a user, no SMART specialist. I can tell you what I would do in your situation. But it's up to you.

If I were running single parity, I would throw that drive out of the array. I would RMA it if possible.

If I were running double parity, I would keep that drive but keep an eye on it.

As long as there are no increasing failures/errors and the other disks are in perfect shape in a double parity system, I would go that way.

I have had disks with all kinds of errors. When errors appeared once and didn't increase, there's a chance that they will run for a long time. On the other hand, I have had lots of disks die really fast.

So it's a lottery. There's no guarantee. Take backups.

trurl · August 25, 2020

10 minutes ago, hawihoney said:

Take backups.

This is the most important advice. You must always have another copy of anything important and irreplaceable.

SPOautos · August 25, 2020

1 hour ago, itimpi said:

The way that Unraid spreads data across drives depends on the settings you have set up for specific shares.

Is there somewhere I can read more about the different ways Unraid can spread data, the settings, and any advantages/disadvantages?
