SPOautos Posted August 24, 2020 I just built this server and installed Unraid for the first time. The HDDs came from another machine where they were working great; they were wiped, and a health check was run that showed them in perfect health. They are four 6TB HGST drives that are 2 or 3 years old but have not been worked hard. Unraid is showing one of them with an error. On the Self Test screen, under Attributes, there is a column of numbers on the left (1-12 and 192-199), and row 198 says "offline uncorrectable". Here are all the values going across: Flag: 0x0008, Value: 100, Worst: 100, Threshold: 000, Type: Old Age, Updated: offline, Failed: never, Raw Value: 13. I have no idea what all that means. Why does it say Old Age? The drives are a couple of years old but not worked very hard, so I'm not sure what constitutes "old age". I started running the "SMART extended self-test", but it looks like it's going to take a very long time before I have results to share. Should I stop the extended self-test and do something else instead? There's no data on the drive.
JorgeB Posted August 24, 2020 Old age is just the type of attribute. A non-zero value for offline uncorrectable is never good, but it doesn't mean that the disk is failing. Wait for the extended SMART test: if it passes, the disk is good for now, and as long as that value doesn't increase it would be fine.
SPOautos Posted August 24, 2020 Author Share Posted August 24, 2020 (edited) 23 minutes ago, johnnie.black said: Old age is just the type of attribute, while a non zero attribute for offline uncorrectable is never good it doesn't mean that the disk is failing, wait for the extended SMART test, if it passes disk is good for now, and as long as that value doesn't increase it would be fine. Thank You for the info Do you have a rough idea how long the extended SMART test should take, being a 6Tb drive with no data. Is the "offline uncorrectable" raw value of 13 telling me that it found 13 bad sectors? Edited August 24, 2020 by SPOautos Quote Link to comment
JorgeB Posted August 24, 2020 6 minutes ago, SPOautos said: "Do you have a rough idea how long the extended SMART test should take on a 6TB drive with no data?" About 2 to 3 hours per TB. You can also see the estimated time for that specific drive in the SMART report, in this line:
Extended self-test routine recommended polling time: (1128) minutes.
7 minutes ago, SPOautos said: "Is the 'offline uncorrectable' raw value of 13 telling me that it found 13 bad sectors?" Possibly, but not necessarily. Different firmware sometimes behaves differently in how it handles bad sectors.
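As a rough cross-check of the two estimates in the post above, the arithmetic can be sketched in Python. The 2 to 3 hours per TB rule of thumb and the 1128-minute polling time come from the post; the function names are made up for this illustration.

```python
def rule_of_thumb_hours(capacity_tb, low=2.0, high=3.0):
    """Return the (low, high) estimate in hours from the 2 to 3 hours per TB rule of thumb."""
    return capacity_tb * low, capacity_tb * high


def polling_time_hours(minutes):
    """Convert the drive's 'recommended polling time' from minutes to hours."""
    return minutes / 60


low, high = rule_of_thumb_hours(6)
reported = polling_time_hours(1128)
print(f"Rule of thumb for 6 TB: {low:.0f} to {high:.0f} hours")
print(f"Drive's own estimate:   {reported:.1f} hours")
```

For this 6TB drive the drive's own figure sits just above the upper end of the rule of thumb, so either way the test is most of a day of work.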
SPOautos Posted August 25, 2020 8 hours ago, johnnie.black said: "About 2 to 3 hours per TB. You can also see the estimated time for that specific drive in the SMART report, in this line: Extended self-test routine recommended polling time: (1128) minutes. Possibly, but not necessarily. Different firmware sometimes behaves differently in how it handles bad sectors." The extended check is at 90%, so it's pretty close to finished. Even if it clears this disk, should I use it as a data disk or as my parity disk? I'm thinking parity, because if it goes bad and I lose it, I replace the disk, the parity is rebuilt, and my data doesn't risk corruption or something. But I'm FAR from an IT person... what do you think?
SPOautos Posted August 25, 2020 (edited) 11 hours ago, johnnie.black said: "About 2 to 3 hours per TB. You can also see the estimated time for that specific drive in the SMART report, in this line: Extended self-test routine recommended polling time: (1128) minutes. Possibly, but not necessarily. Different firmware sometimes behaves differently in how it handles bad sectors." The finished extended test came back error free. Yet the dashboard still lists that drive with an orange thumbs down saying errors. Do I need to reboot the server, or should I do something else? Any other type of test? I'm not sure what I should do from here.
ChatNoir Posted August 25, 2020 It depends on the origin of the thumbs down. You should post your diagnostics so the guys who know SMART can suggest what to do next.
SPOautos Posted August 25, 2020 3 hours ago, ChatNoir said: "It depends on the origin of the thumbs down. You should post your diagnostics so the guys who know SMART can suggest what to do next." I just installed Unraid for the first time yesterday, so please excuse the ignorance, lol. Which diagnostics, and how do I get them from Unraid to here? I don't have any plugins or Dockers set up yet, and not even an array (I was waiting on this before I start labeling drives). I saw that there is an option to download the results of my extended test, but I don't have email or anything like that in Unraid to get it out of Unraid.
JorgeB Posted August 25, 2020 Tools -> Diagnostics, then attach the complete zip.
ChatNoir Posted August 25, 2020 No problem, I saw 30+ posts on your account, so I incorrectly assumed that you had more experience with Unraid. The diagnostics gather tons of anonymized information about your system (SMART, shares, CPU, memory, etc.). They are available in Tools / Diagnostics.
SPOautos Posted August 25, 2020 12 minutes ago, johnnie.black said: "Tools -> Diagnostics, then attach the complete zip." Is this correct? tower-diagnostics-20200825-0521.zip
ChatNoir Posted August 25, 2020 Yes! I see that your disk sdc does indeed have:
198 Offline_Uncorrectable ---R-- 100 100 000 - 13
Let's see what the experts can make of that; I am learning as you are.
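To make the row above easier to read, here is a minimal Python sketch that splits it into named fields. The column order (ID, name, flags, value, worst, threshold, fail, raw value) is an assumption based on smartctl's brief attribute layout, and the class and function names are invented for this example. The usual reading is that an attribute only counts as tripped when the normalized value falls to or below a non-zero threshold; with a threshold of 000 that cannot happen here, so it is the raw count of 13 that draws attention.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    flags: str
    value: int   # normalized current value
    worst: int   # lowest normalized value recorded
    thresh: int  # failure threshold for the normalized value
    fail: str    # "-" when the attribute has never failed
    raw: str     # vendor-specific raw value, kept as text


def parse_attribute_row(row: str) -> SmartAttribute:
    """Split one attribute row; the raw field may itself contain spaces."""
    parts = row.split(None, 7)
    if len(parts) != 8:
        raise ValueError(f"expected 8 columns, got {len(parts)}: {row!r}")
    attr_id, name, flags, value, worst, thresh, fail, raw = parts
    return SmartAttribute(int(attr_id), name, flags, int(value), int(worst),
                          int(thresh), fail, raw)


def has_tripped(attr: SmartAttribute) -> bool:
    """True when the normalized value is at or below a non-zero threshold."""
    return attr.thresh > 0 and attr.value <= attr.thresh


def raw_count(attr: SmartAttribute) -> int:
    """Leading integer of the raw value (assumes the raw field starts with a number)."""
    return int(attr.raw.split()[0])


attr = parse_attribute_row("198 Offline_Uncorrectable ---R-- 100 100 000 - 13")
print(attr.name, "tripped:", has_tripped(attr), "raw count:", raw_count(attr))
```

Tracking `raw_count` over time is the part that matters for the advice given earlier in the thread: a stable 13 is tolerable, an increasing number is not.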
SPOautos Posted August 25, 2020 1 minute ago, ChatNoir said: "Yes! I see that your disk sdc does indeed have: 198 Offline_Uncorrectable ---R-- 100 100 000 - 13 Let's see what the experts can make of that; I am learning as you are." Yes, that's what prompted the original post. But when I ran the extended SMART test, it found zero errors. I guess the errors were in the past? I have no idea, really.
SPOautos Posted August 25, 2020 29 minutes ago, johnnie.black said: "Tools -> Diagnostics, then attach the complete zip." Thought I'd also add the results of the SMART test: tower-smart-20200825-0541.zip
ChatNoir Posted August 25, 2020 It is already included in the diagnostics; that is where I looked.
hawihoney Posted August 25, 2020 7 hours ago, SPOautos said: "orange thumbs down" The orange mark comes from the setting shown below. The global SMART settings under Settings / Disk Settings let you define which SMART attributes trigger this orange mark. 198 Offline_Uncorrectable is one of them:
SPOautos Posted August 25, 2020 34 minutes ago, hawihoney said: "The orange mark comes from the setting shown below. The global SMART settings under Settings / Disk Settings let you define which SMART attributes trigger this orange mark. 198 Offline_Uncorrectable is one of them:" Ahhhh... something tells me I'll be learning something new every day for a longggg time. Thank you! Hopefully someone FAR more knowledgeable than me can look at this stuff and tell me if it's decently safe to use this HDD.
JorgeB Posted August 25, 2020 The SMART test completed successfully, so the disk is good for now, but it failed two previous ones:
# 1  Extended offline  Completed without error  00%  21384  -
# 2  Extended offline  Completed: read failure  90%  21288  9432616
# 3  Short offline     Completed: read failure  10%  21127  9432616
For now, acknowledge the SMART attributes, but if there are any more read errors in the near future it is best to replace the disk.
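The three log rows above can be picked apart with a short Python sketch. It assumes the columns follow smartctl's self-test log layout (test number, description, status, percent remaining, power-on hours at the time of the test, LBA of the first error); the regular expression and the names `ROW` and `parse_log` are illustrative only and cover just the status strings seen in this thread.

```python
import re

LOG = """\
# 1 Extended offline Completed without error 00% 21384 -
# 2 Extended offline Completed: read failure 90% 21288 9432616
# 3 Short offline Completed: read failure 10% 21127 9432616
"""

# Only handles statuses that start with "Completed", as in the rows above.
ROW = re.compile(
    r"#\s*(?P<num>\d+)\s+(?P<test>.+?)\s+(?P<status>Completed.*?)\s+"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)"
)


def parse_log(text):
    """Return one dict per recognised self-test row, newest first as logged."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["num"] = int(entry["num"])
            entry["remaining"] = int(entry["remaining"])
            entry["hours"] = int(entry["hours"])
            entries.append(entry)
    return entries


entries = parse_log(LOG)
failed = [e for e in entries if "failure" in e["status"]]
print("Failed tests:", [e["num"] for e in failed])
print("LBAs of first error:", sorted({e["lba"] for e in failed}))
print("Power-on hours between failed and passing extended test:",
      entries[0]["hours"] - entries[1]["hours"])
```

Both failed tests stopped at the same LBA, and the passing run was logged 96 power-on hours after the failed extended test.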
SPOautos Posted August 25, 2020 18 minutes ago, johnnie.black said: "The SMART test completed successfully, so the disk is good for now, but it failed two previous ones:
# 1  Extended offline  Completed without error  00%  21384  -
# 2  Extended offline  Completed: read failure  90%  21288  9432616
# 3  Short offline     Completed: read failure  10%  21127  9432616
For now, acknowledge the SMART attributes, but if there are any more read errors in the near future it is best to replace the disk." Could that be because something happened, like the computer shutting off during testing? Is it odd that it didn't pass at one point and now it does? Do you think I should make this one my parity disk as opposed to a data disk? That way, if it gets flaky, I just replace it and it recreates the parity data, so it doesn't risk corrupting my actual data.
itimpi Posted August 25, 2020 2 minutes ago, SPOautos said: "Do you think I should make this one my parity disk as opposed to a data disk? That way, if it gets flaky, I just replace it and it recreates the parity data, so it doesn't risk corrupting my actual data." I would say it should NOT be your parity disk. If a disk fails, Unraid relies on being able to reliably read ALL the other drives plus parity to recreate the contents of the failed drive. Therefore, if you use your unreliable disk as parity, you have no confidence that you can recover a failed array disk with its contents uncorrupted.
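The point about parity can be illustrated with a small Python sketch. It assumes plain XOR parity across equal-sized blocks, which is how single parity is commonly described; the tiny byte strings stand in for whole disks, and a single flipped bit stands in for whatever an unreliable parity disk might return. None of the names below come from Unraid itself.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" and the parity computed from them.
data_disks = [b"movie", b"photo", b"notes"]
parity = xor_blocks(data_disks)

# Disk 1 fails: rebuild it from the surviving data disks plus parity.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
print("Rebuilt with good parity:", rebuilt)

# Same rebuild, but the parity disk returns one wrong bit.
bad_parity = bytes([parity[0] ^ 0x01]) + parity[1:]
corrupted = xor_blocks([data_disks[0], data_disks[2], bad_parity])
print("Rebuilt with bad parity: ", corrupted)
print("Matches original:", corrupted == data_disks[1])
```

Every disk that takes part in the rebuild feeds directly into the result, so an error on the parity disk lands in the recovered data just as an error on a surviving data disk would.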
SPOautos Posted August 25, 2020 (edited) 1 hour ago, itimpi said: "I would say it should NOT be your parity disk. If a disk fails, Unraid relies on being able to reliably read ALL the other drives plus parity to recreate the contents of the failed drive. Therefore, if you use your unreliable disk as parity, you have no confidence that you can recover a failed array disk with its contents uncorrupted." Okay, thinking about it, I have four 6TB drives total. I can use 3 for data and 1 for parity. I only have about 8TB of data right now, so I'll have one data drive just sitting empty, which could be this one that has had errors before. Or does Unraid automatically spread the data out across all three data drives and use them all?
itimpi Posted August 25, 2020 The way that Unraid spreads data across drives depends on the settings you have set up for specific shares. If you do not need all the drives at the moment, I would recommend that you do not even add this one to the array, but keep it available for when you actually need it. That could be as an additional drive if you need the space, or as a replacement for a failed drive. Each additional drive is a potential additional point of failure, so why not avoid even that small chance at this point?
hawihoney Posted August 25, 2020 I'm just a user, not a SMART specialist. I can tell you what I would do in your situation, but it's up to you. If I were running single parity, I would throw that drive out of the array and RMA it if possible. If I were running double parity, I would keep that drive but keep an eye on it. As long as there are no increasing failures or errors and the other disks are in perfect shape in a double parity system, I would go that way. I have had disks with all kinds of errors. When errors appeared once and didn't increase, there was a chance that the disks would run for a long time. On the other hand, I have had lots of disks die really fast. So it's a lottery. There's no guarantee. Take backups.
trurl Posted August 25, 2020 10 minutes ago, hawihoney said: "Take backups." This is the most important advice. You must always have another copy of anything important and irreplaceable.
SPOautos Posted August 25, 2020 1 hour ago, itimpi said: "The way that Unraid spreads data across drives depends on the settings you have set up for specific shares." Is there somewhere I can read more about the different ways Unraid can spread data, the settings, and any advantages or disadvantages?