digitalagent Posted October 2, 2018 (edited)

Hi, I've been using unRAID for about a year now with ZERO issues whatsoever; I love this thing. When I first configured my array I set up email notifications, as the guide I used for configuration suggested, and have been pleased with the many "PASS" emails I've received. This morning I got my first heart attack from the system: a "FAIL" message with the error below:

Disk 7 - ST2000DM001-1CH164_Z1E42NBH (sdc) - active 24 C (disk has read errors) [NOK]

I'd say I'm above average when it comes to tech savviness and storage, and I have even dealt with enterprise-grade storage in one form or another for the past 7 years in my career. But when it comes to my own personal data on unRAID, some parts are still new to me. After doing some Googling and browsing the forums and even Reddit, I've seen mixed advice on my next course of action and figured it's best to ask here. Some people recommend fixing the filesystem on that drive, etc.

I have turned off Plex so there are no read requests going to the array. The only other devices that connect to it are my main desktop PC and a Firestick (I stopped accessing any files from both as soon as I saw the error message this morning). The array is currently online, and parity is valid.

Is it as simple as stopping the array, powering down, pulling that disk and replacing it with a new one (which I know a pre-clear should be run on), or do I need to do anything with the file system on disk 7? I also saw another recommendation to run a parity check with "write corrections to parity" turned off after the rebuild completes.

I've attached my diagnostic file to this post and am looking forward to any responses. The data on the array is a backup of the personal files on my main PC (documents, photos, music) and my DVD rips. Thanks!
System Info:
unRAID v6.2.1
Intel Core i3-4160
6GB DDR3 (2x3GB sticks)
Kingston 8GB USB Flash for OS
Array of nine devices - 20TB
Parity - 4TB
1x4TB, 2x3TB, 5x2TB
The only docker I'm using is Plex Media Server.

nas-diagnostics-20181001-1938.zip

Edited October 9, 2018 by digitalagent
trurl Posted October 2, 2018

A few read errors don't typically call for replacing a disk, and Unraid hasn't disabled it. Try running an extended SMART test on it. Click on the disk in Main to get to its page.

That's a pretty old version of Unraid you're running. You should consider upgrading after you get confident in your array again. Old versions are more difficult for us to support.
JorgeB Posted October 2, 2018

This is an error that I only see happen with Seagates:

Error: UNC at LBA = 0x0fffffff = 268435455

That is a UNC error at a bogus LBA address. The problem is the disk itself, but it can sometimes go a long time without more errors. If you keep getting them, though, it's time to replace it.
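For readers wondering about the LBA in that error, here is a minimal Python sketch of the arithmetic. The hex and decimal values are taken from the post above; the observation that the number equals 2^28 - 1 (the largest value that fits in 28 bits) is plain arithmetic, and reading that as a sign of a placeholder rather than a real sector address is only an interpretation consistent with the "bogus LBA" remark, not something the drive log states.

```python
# LBA reported in the error log, as quoted in the post above.
reported_lba = 0x0FFFFFFF

# Largest value representable in 28 bits (all 28 bits set).
max_28bit = 2**28 - 1

print(reported_lba)               # decimal form of the hex value
print(reported_lba == max_28bit)  # True when the LBA is exactly "all ones" in 28 bits
print(f"{reported_lba:#010x}")    # back to the zero-padded hex form
```

An address that sits exactly on a bit-width limit is suspicious on its face, which fits the advice to watch whether the errors recur rather than to go hunting for one specific bad sector.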
digitalagent Posted October 2, 2018

So I ran the extended (long) SMART test. It says it completed without error. I'm not the best at reviewing SMART results; the only thing that concerned me was the 6 uncorrectables (attribute 187).

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 111   099   006    Pre-fail Always  -           31415680
  3 Spin_Up_Time            0x0003 096   095   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032 100   100   020    Old_age  Always  -           87
  5 Reallocated_Sector_Ct   0x0033 100   100   010    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000f 069   060   030    Pre-fail Always  -           10013517
  9 Power_On_Hours          0x0032 067   067   000    Old_age  Always  -           29602
 10 Spin_Retry_Count        0x0013 100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   020    Old_age  Always  -           87
183 Runtime_Bad_Block       0x0032 100   100   000    Old_age  Always  -           0
184 End-to-End_Error        0x0032 100   100   099    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032 094   094   000    Old_age  Always  -           6
188 Command_Timeout         0x0032 100   100   000    Old_age  Always  -           0 0 0
189 High_Fly_Writes         0x003a 087   087   000    Old_age  Always  -           13
190 Airflow_Temperature_Cel 0x0022 076   052   045    Old_age  Always  -           24 (Min/Max 23/29)
191 G-Sense_Error_Rate      0x0032 100   100   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0032 100   100   000    Old_age  Always  -           19
193 Load_Cycle_Count        0x0032 084   084   000    Old_age  Always  -           33153
194 Temperature_Celsius     0x0022 024   048   000    Old_age  Always  -           24 (0 10 0 0 0)
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           0
240 Head_Flying_Hours       0x0000 100   253   000    Old_age  Offline -           1696h+59m+50.082s
241 Total_LBAs_Written      0x0000 100   253   000    Old_age  Offline -           9984744079
242 Total_LBAs_Read         0x0000 100   253   000    Old_age  Offline -           55994108160

Should I consider replacing the drive, or just keep an eye on it and see if those numbers increase? Thanks!

nas-smart-20181001-2215.zip
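For anyone who wants to "keep an eye on it" without rereading the whole table each time, here is a rough Python sketch that pulls the raw values out of attribute rows like the ones in the post above and reports which watched counters are non-zero. The function names and the choice of watched IDs (5, 187, 197, 198) are assumptions made for illustration, not an Unraid or smartmontools feature; the sample rows are copied from the post.

```python
# Hypothetical helper names; the sample rows are copied from the post above.
# Assumption: these IDs are the counters worth tracking over time.
WATCHED_IDS = (5, 187, 197, 198)

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033 100   100   010    Pre-fail Always  -           0
187 Reported_Uncorrect      0x0032 094   094   000    Old_age  Always  -           6
188 Command_Timeout         0x0032 100   100   000    Old_age  Always  -           0 0 0
190 Airflow_Temperature_Cel 0x0022 076   052   045    Old_age  Always  -           24 (Min/Max 23/29)
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           0
"""


def parse_attributes(text):
    """Map attribute ID -> (name, raw value string) for each attribute row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns;
        # this skips the header line and anything else that is not a row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        # The raw value may contain spaces, so rejoin everything after column 9.
        attrs[int(fields[0])] = (fields[1], " ".join(fields[9:]))
    return attrs


def nonzero_watched(attrs, watched=WATCHED_IDS):
    """Return {name: raw count} for watched attributes whose raw count is not 0."""
    flagged = {}
    for attr_id in watched:
        if attr_id not in attrs:
            continue
        name, raw = attrs[attr_id]
        first = raw.split()[0]
        if first.isdigit() and int(first) != 0:
            flagged[name] = int(first)
    return flagged


print(nonzero_watched(parse_attributes(SAMPLE)))
```

Run against a fresh SMART report every so often, the useful signal is whether any flagged number grows between runs, which is the same criterion suggested in the replies.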
JorgeB Posted October 2, 2018

I would likely give it a second chance, but replace it if more similar errors show up soon.
digitalagent Posted October 3, 2018

Hmmmm. On the Main page I am seeing an error count of 768 for disk 7 (not sure what the total was previously), while all the others are at 0. I do have a spare brand-new 3TB drive lying around and would just feel more comfortable replacing it. Is it as simple as following this guide: https://wiki.unraid.net/Replacing_a_Data_Drive or do I need to worry about any type of corrupt files or file system repair?
JorgeB Posted October 3, 2018

5 hours ago, digitalagent said: "Is it as simple as just following this guide: https://wiki.unraid.net/Replacing_a_Data_Drive"

Yes.

5 hours ago, digitalagent said: "or do i need to worry about any type of corrupt files/file system repair?"

Only if the disk becomes unmountable or there are any filesystem-related issues in the log.
digitalagent Posted October 9, 2018

Thanks Johnnie and Trurl. Another 300 errors got added to that disk, so I ended up replacing it. The rebuild completed with 0 errors. Thanks for the guidance.