After Parity Check - UNRAID gives disk read errors (SOLVED)



I precleared this 4TB WD Red, and it has been running smoothly for two years. I have run parity checks many times over the years, but today's parity check (with parity correction turned off) gave me disk read errors. Should I be concerned and copy all my data to another disk? Should I RMA this drive ASAP? I have attached my SMART log as a text file.

 

Thank you

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    68
  3 Spin_Up_Time            POS--K   187   180   021    -    7608
  4 Start_Stop_Count        -O--CK   099   099   000    -    1677
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   100   253   000    -    0
  9 Power_On_Hours          -O--CK   066   066   000    -    25394
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    200
192 Power-Off_Retract_Count -O--CK   200   200   000    -    21
193 Load_Cycle_Count        -O--CK   190   190   000    -    32749
194 Temperature_Celsius     -O---K   120   108   000    -    32
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning
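
A table like the one above can be scanned automatically rather than by eye. The following is a minimal sketch, assuming the column layout shown above; the list of watched attribute IDs and the function names are illustrative choices, not an official rule set.

```python
# Sketch: flag watched SMART attributes whose raw value is non-zero.
# Assumption: rows follow the layout above
# (ID, name, flags, value, worst, thresh, fail, raw) with integer raw values.

# IDs that are commonly looked at first on a suspect disk (illustrative choice).
WATCHED_IDS = {1, 5, 197, 198, 199}

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    68
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
"""


def parse_smart_table(text):
    """Return {attribute_id: row dict} for every data row in the table."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a numeric ID and have at least 8 columns;
        # the header and the flag legend are skipped by this check.
        if len(parts) < 8 or not parts[0].isdigit():
            continue
        rows[int(parts[0])] = {
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": int(parts[7]),
        }
    return rows


def nonzero_watched(rows, watched=WATCHED_IDS):
    """Return (id, name, raw) for watched attributes with a non-zero raw value."""
    return [
        (attr_id, rows[attr_id]["name"], rows[attr_id]["raw"])
        for attr_id in sorted(watched)
        if attr_id in rows and rows[attr_id]["raw"] != 0
    ]


if __name__ == "__main__":
    for attr_id, name, raw in nonzero_watched(parse_smart_table(SAMPLE)):
        print(f"{attr_id:3d} {name}: raw={raw}")
```

For this disk, only attribute 1 (Raw_Read_Error_Rate) comes back non-zero; the reallocated and pending sector counts are still 0.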

 

Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898920
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898928
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898936
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898944
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898952
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898960
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898968
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898976
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898984
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568898992
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899000
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899008
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899016
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899024
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899032
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899040
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899048
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899056
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899064
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899072
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899080
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899088
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899096
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899104
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899112
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899120
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899128
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899136
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899144
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899152
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899160
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899168
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899176
Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector=3568899184
Jul 18 04:25:26 unRAID kernel: md: sync done. time=35008sec
Jul 18 04:25:26 unRAID kernel: md: recovery thread: completion status: 0
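
The 34 read errors above all sit in one contiguous stretch of disk2, 8 sectors apart. A small script can make that visible in a longer syslog. This is a rough sketch, assuming the log line format shown above; the function names are made up for illustration.

```python
import re

# Matches the kernel lines quoted above, e.g.
# "md: disk2 read error, sector=3568898920"
READ_ERROR_RE = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def read_error_sectors(log_text):
    """Return {disk_name: sorted list of unique failing sector numbers}."""
    found = {}
    for disk, sector in READ_ERROR_RE.findall(log_text):
        found.setdefault(disk, set()).add(int(sector))
    return {disk: sorted(sectors) for disk, sectors in found.items()}


def contiguous_runs(sectors, step=8):
    """Group sorted sectors into (first, last, count) runs spaced `step` apart."""
    runs = []
    for sector in sectors:
        if runs and sector - runs[-1][1] == step:
            first, _, count = runs[-1]
            runs[-1] = (first, sector, count + 1)
        else:
            runs.append((sector, sector, 1))
    return runs


# Rebuild the 34 log lines from the excerpt instead of pasting them again.
SAMPLE_LOG = "\n".join(
    f"Jul 17 22:22:47 unRAID kernel: md: disk2 read error, sector={s}"
    for s in range(3568898920, 3568899184 + 1, 8)
)

if __name__ == "__main__":
    for disk, sectors in read_error_sectors(SAMPLE_LOG).items():
        for first, last, count in contiguous_runs(sectors):
            print(f"{disk}: {count} errors, sectors {first}..{last}")
```

A single short run like this points at one damaged region of the disk rather than errors scattered across the whole surface.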

 

 

WDC_WD40EFRX-68WT0N0_WD-WCC4E0394858-20180718-0708.txt

Edited by mgsvr
Link to comment

Thank you, Johnnie.

 

I will RMA the disk back to WD. I noticed that my other disks also show read errors, with raw values between 2 and 6; however, UNRAID has not reported them yet.

 

Given the read errors on the failing Disk 2, will all my media files remain intact, without any lost or altered bits? I have lots of ripped movies, and it would be bad if I had to rip them all again.

Link to comment
Just now, mgsvr said:

I will RMA the disk back to WD. I noticed that my other disks also show read errors, with raw values between 2 and 6

That's never a good sign, but single digits are usually OK.

 

1 minute ago, mgsvr said:

Given the read errors on the failing Disk 2, will all my media files remain intact, without any lost or altered bits? I have lots of ripped movies, and it would be bad if I had to rip them all again.

They should all be fine as long as parity is valid and you replace the disk with a new one and let unRAID rebuild it. The only way to be sure, though, is to check the files after the rebuild, either against checksums (if you have them) or, if the disk uses btrfs, with its built-in checksums.
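
As a rough illustration of the checksum approach, here is a minimal sketch using only Python's standard library. It records a SHA-256 digest per file before trouble starts and compares again after a rebuild. The function names are illustrative, and this is not how any particular unRAID plugin works.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Return {relative_path: sha256 hex digest} for every file under root."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Return the manifest entries that are now missing or hash differently."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )


def demo():
    """Simulate one silently altered byte and show that it is detected."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.bin").write_bytes(b"first movie")
        (root / "b.bin").write_bytes(b"second movie")
        manifest = build_manifest(root)  # taken while the data is known good
        (root / "b.bin").write_bytes(b"second moviE")  # simulated corruption
        return changed_files(root, manifest)
```

The manifest has to be created while the data is still known to be good; checksums taken after a bad rebuild would only confirm the corrupted copy.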

Link to comment

Which is better for me: having UNRAID rebuild the data, or copying the data from a backup to a new drive?

 

My unRAID currently has 1 parity drive and 3x 4TB disks. I also have an 8TB disk mounted with Unassigned Devices to back up Disk 1 and Disk 2. I have not done any backup yet, though. I just bought the 8TB and precleared it with 2 cycles, and I planned to do the backup this weekend. It sucks that I got a disk error today.

 

That's my backup plan for now, because I don't have the money to build another UNRAID backup box yet.

Edited by mgsvr
Link to comment

OK, I'm going to take my currently empty Disk 3, reassign it as Disk 2, and have unRAID rebuild it.

 

Johnnie, I SSH into UNRAID and run a "nohup rsync -av &" command to back up my shared disks to my 8TB backup disk. After I close the terminal, is there a way to SSH back into UNRAID and see the running rsync process? Same question for Midnight Commander: how do I log in again and see the process without using something like Screen?

 

I normally log into my Windows 10 VM, launch MC, and leave the terminal open...

 

Thank you again for your help.

Link to comment

I don't use Screen either. Would be nice to SSH into unraid again to see the running nohup rsync process.

 

Looks like I have to wait for WD to send me another disk so I can do a rebuild. I tried unassigning both Disk 2 and Disk 3, but unRAID won't let me start the array. I was planning to use the empty Disk 3, change it into Disk 2, and rebuild. I guess that's because I only have 1 parity drive.

Link to comment
1 minute ago, mgsvr said:

I tried unassigning both Disk 2 and Disk 3, but unRAID won't let me start the array. I was planning to use the empty Disk 3, change it into Disk 2, and rebuild.

I missed that post earlier. Yeah, you can't do that. With dual parity you could, but there would be no point; you need to use a new disk.

Link to comment

It was a bad idea. I'm glad UNRAID won't let the array start up. I will have to wait for WD's RMA.

 

If we cannot SSH into UNRAID in a new session to see the previously issued "nohup rsync" process, is there a way to kill that rsync run? For example, say I issued a wrong rsync command and closed the terminal. Later I SSH back into unRAID, and since I cannot see its progress, can I still kill the previous run?
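
For reference, a process started with nohup keeps running after the terminal closes and still shows up in the system process list from any later SSH session, so it can be found by name and signalled. Unless redirected, its output normally lands in nohup.out in the directory where it was started. Below is a minimal sketch of the idea, assuming a Linux `ps` that accepts `-eo pid,args`; the rsync paths in the sample are hypothetical, and the function names are made up for illustration.

```python
import os
import signal
import subprocess

# Hypothetical `ps -eo pid,args` output; the rsync source and target are examples.
SAMPLE_PS = """\
  PID COMMAND
    1 init
 4242 rsync -av /mnt/disk1/ /mnt/disks/backup/disk1/
 4250 grep rsync
"""


def find_pids(ps_output, program):
    """Return (pid, command line) for processes whose executable is `program`."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header and any line without a numeric PID plus a command.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        # Compare only the executable name, so "grep rsync" does not match.
        if os.path.basename(command.split()[0]) == program:
            matches.append((pid, command))
    return matches


def running(program):
    """List live processes named `program` using the system `ps`."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return find_pids(result.stdout, program)


def stop(pid):
    """Ask the process to exit cleanly; SIGTERM lets rsync clean up first."""
    os.kill(pid, signal.SIGTERM)


if __name__ == "__main__":
    for pid, command in running("rsync"):
        print(pid, command)
```

Calling `stop(pid)` on a PID printed by the script ends that run. Reattaching to the live screen of the old session is a different matter, which is what tools like Screen are for.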

 

 

Link to comment

HELP!

 

I accidentally set Disk 2 to "no device" and started the array. I checked with WD, and it turns out the drive is out of warranty.

 

Is it okay for me to stop the array, assign Disk 2 again, and start the array? Unraid now says it is a new device. Will Unraid rebuild/parity-sync it? I don't want that operation.

 

I want to assign it again and copy the data off to my backup Unassigned Device. I will buy an 8TB drive to replace my current 4TB parity drive, and use the old 4TB parity drive as the new Disk 2. I don't want to buy 4TB drives anymore. Just 8TB from now on.

Link to comment

Thanks, Johnnie.

 

I am an idiot. I am glad you answered quickly, before I did anything more harmful. I have left Disk 2 unassigned and started the array again. I will open MC in a terminal and copy Disk 2 to my 8TB unassigned device.

 

Once I buy another 8TB drive and preclear it, I will pull the current parity drive and assign it as Disk 2.

Copy the data from the UD to the new Disk 2.

Assign the 8TB as parity and let it parity-sync.

 

Please let me know if I am correct.

 

Link to comment

Thank you very much, Johnnie.

 

Basically I just need to swap out the old 4TB parity drive and assign the 8TB as the new parity drive.

The old 4TB parity drive will be assigned as Disk 2.

 

The parity content of the old 4TB parity drive (the future Disk 2) will be copied onto the 8TB parity drive.

Then unRAID will rebuild Disk 2.

 

This is great! I will hold off on the procedure until the new 8TB drive I ordered arrives. I will use that one to back up Disk 1 and Disk 2.

Link to comment

Hi guys,

 

So yesterday I finished preclearing the 8TB and then formatted the drive as XFS. I was planning to use it as an Unassigned Device to back up my array.

 

Anyway, today I have to do the parity-swap procedure. UNRAID is currently copying the old parity onto the new 8TB parity drive.

 

Does UNRAID get rid of the XFS and clear the 8TB drive before it does the copy from the old parity? I understand that parity has no file system. I just want to learn how UNRAID handles an already formatted drive (XFS, NTFS, etc.) during this parity copy process.

 

Also, after UNRAID rebuilds the old parity drive (now Disk 2), do I need to run a parity check with error correction off?

Edited by mgsvr
Link to comment
42 minutes ago, mgsvr said:

Does UNRAID get rid of the XFS and clear the 8TB drive before it does the copy from the old parity?

 

The file system is just data on the sectors, so when unRAID does a raw block-by-block copy of the parity data, any existing file system data on the target drive is overwritten.

Link to comment

Hi guys,

 

Quick question before I mark this topic as solved. Over the past month I did a lot of writes to my previous Disk 2. The other day, when I ran a non-correcting parity check, UNRAID reported read errors on Disk 2. So I guess the drive has been bad for many months already? The last time I did a parity check was Jan 2018, with error correction TURNED ON (I didn't know it's better to turn it off).

 

After I did the parity-swap procedure, as suggested by Johnnie, there were no errors when Disk 2 was rebuilt. Will all my Disk 2 data be okay, with no corrupted files? I mean, could the previous bad Disk 2 have updated parity with wrong information? I never got any write errors, though. Just wondering if my data is corrupted.

 

Thanks.

Link to comment
4 minutes ago, mgsvr said:

with error-correcting TURNED ON (I didn't know it's better to turn it off).

Most likely your data is fine. Parity corruption doesn't always happen when a disk has errors during a correcting parity check; in fact it's pretty rare, but it is still possible, which is why non-correcting checks are recommended. You can do a spot check of your data, but only with checksums (or btrfs) could you confirm that the rebuilt disk is 100% correct.

 

 

Link to comment

I didn't know about the Dynamix File Integrity plugin until last week. I'll keep checksums from now on. I also didn't know it's better to leave error correction unchecked. Hopefully a future UNRAID update will have that checkbox unchecked by default. Newbies like me will not know about it.

 

So even if the drive has had read errors for a couple of months, as long as I never got any write errors when I wrote data to it last month, parity is most likely fine? I thought UNRAID reads the data disk to calculate parity. Sorry, I am so paranoid about the data, and I am trying to learn more about UNRAID.

 

Thank you for your help again.

Link to comment
1 minute ago, mgsvr said:

So even if the drive has had read errors for a couple of months, as long as I never got any write errors when I wrote data to it last month, parity is most likely fine?

Most likely, and almost certainly if you didn't get sync errors during the correcting checks. You should still be able to see that if you click on the parity check history or have the logs from those checks.

 

The normal unRAID behavior when it finds a read error on a data disk is to use all the other disks plus parity to calculate the correct data and rewrite it to the problem disk. This maintains data integrity as long as parity is valid. The problem is that sometimes a failing disk "goes crazy", and instead of doing the above, unRAID updates parity with the wrong data, which will corrupt any future rebuild.
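
The reconstruction described above can be shown in a few lines. This is a toy sketch, assuming single parity computed as a bytewise XOR across the data disks and using 8-byte "disks"; it is an illustration of the principle, not unRAID's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks, one block per disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three tiny data "disks"; disk3 is empty (all zeros).
disk1 = b"movie-A!"
disk2 = b"movie-B!"
disk3 = bytes(8)

# Parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# If disk2 cannot be read, XOR of everything else gives its contents back.
rebuilt_disk2 = xor_blocks([disk1, disk3, parity])

# If parity was "corrected" from bad data, the same rebuild returns garbage
# for the affected bytes. Here the first parity byte is deliberately wrong.
bad_parity = bytes([parity[0] ^ 0xFF]) + parity[1:]
corrupt_disk2 = xor_blocks([disk1, disk3, bad_parity])

if __name__ == "__main__":
    print(rebuilt_disk2 == disk2)   # rebuild from valid parity
    print(corrupt_disk2 == disk2)   # rebuild from wrong parity
```

With valid parity the rebuilt block matches the original exactly. With one wrong parity byte, exactly that byte of the rebuilt disk is wrong, and nothing in the rebuild itself reveals it, which is why checksums are the only way to be certain.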

Link to comment

Hopefully everything is fine, as you said. Thank you again for all the help.

 

1/2018 - Ran a parity check with correction (writes to parity) turned on

7/13 - Ran a parity check, Disk 2 got read errors

7/18 and 7/19 - Parity-swap and Disk 2 rebuilt

 

Date                  Duration               Speed        Status    Errors
2018-07-19, 17:07:33  9 hr, 32 min, 54 sec   232.8 MB/s   OK        0
2018-07-18, 04:25:26  9 hr, 43 min, 28 sec   114.3 MB/s   OK        0
2018-07-13, 07:07:45  9 hr, 4 min, 18 sec    122.5 MB/s   OK        0
2018-01-24, 07:23:19  9 hr, 39 min, 46 sec   115.0 MB/s   OK        0
2018-01-18, 20:51:38  27 sec                 Unavailable  Canceled  0
2017-12-14, 06:50:15  9 hr, 39 min, 34 sec   115.1 MB/s   OK        0
2017-08-04, 08:13:14  9 hr, 39 min, 23 sec   115.1 MB/s   OK        0
2017-02-02, 08:08:48  9 hr, 39 min, 33 sec   115.1 MB/s   OK
2016-09-23, 05:39:05  9 hr, 41 min, 27 sec   114.7 MB/s   OK
Mar, 04, 03:40:22     9 hr, 39 min, 18 sec   115.1 MB/s   OK
Link to comment
