
Failed drive, second drive started throwing errors during rebuild


I've been using unRAID for just over a year. It has been a great experience, with the exception of my stupid choice to use Seagate drives... I've lost about 4 or so in just under a year. Time to switch to WD Red, I think. I woke up this morning to an email showing that my number 6 drive had failed. I stopped the array and rebooted the system, hoping the drive would come back. No such luck. I found the offending drive in my rig and replaced it with another drive. I logged into unRAID, verified that all the other drives were fine and that my drive 6 replacement had the blue dot, then started the array and the rebuild process.

 

I just checked on the rebuild status and noticed that disk 8 now shows * for temperature and 171141458 errors. I am thoroughly afraid of losing data. Advice on how to proceed is greatly appreciated from this great community.

[Attached screenshot: unraidfails_Copy.jpg]

I would post a syslog. I have a feeling that there are going to be a lot of ATA errors in it relating to disk 8 (probably cable related, since you just swapped out a drive). I'm a HUGE fan of hot-swap bays because of this.

 

Unfortunately you *may* have some corruption on disk 8 because of those errors. When unRAID detects a read error, it reads all of the other drives to calculate what the data should be and then writes it back to the drive that returned the error. Unfortunately, since one drive (6) was in the middle of a reconstruction, the data read from it may or may not have been valid, so the data written to 8 may or may not be correct. (This may be a bug in unRAID. In my opinion it shouldn't automatically correct read errors on a drive if the system is undergoing a rebuild at the time.)
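
To make the concern above concrete, here is a minimal Python sketch of single-parity reconstruction. It assumes plain XOR parity over equal-sized blocks; the function names and the toy byte values are invented for illustration and are not unRAID's actual code. It shows how a block recomputed for disk 8 comes out right when every other disk is valid, and wrong when disk 6 has not been rebuilt yet.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(parity, other_disks):
    """Recover one disk's block from parity plus every other data disk."""
    return xor_blocks([parity, *other_disks])

# Toy array: three data disks and one parity disk (hypothetical values).
disk6 = bytes([0x11, 0x22, 0x33])
disk7 = bytes([0x0A, 0x0B, 0x0C])
disk8 = bytes([0xF0, 0x0F, 0xAA])
parity = xor_blocks([disk6, disk7, disk8])

# Healthy case: disk8 fails a read while all other disks hold valid data.
good = reconstruct(parity, [disk6, disk7])

# Rebuild case: disk6 is a replacement whose block has not been rebuilt yet,
# modeled here as all zeros. The value computed for disk8 is then wrong.
stale_disk6 = bytes(3)
bad = reconstruct(parity, [stale_disk6, disk7])
```

In this toy model `good` equals the original `disk8` block and `bad` does not, which is the situation described above if a correction is written back while another disk is still being reconstructed.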

 

I had a similar problem about a year ago, and am still finding the odd movie which doesn't play correctly that is stored on the offending drive.

 

(As an aside, I now also have MD5 checksums for everything stored on the drives, so that if this problem ever happens again I can easily find the problem files.)
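
One possible way to keep such checksums is sketched below in Python. The poster does not say which tool they use, so this is only an illustration: `build_manifest` and `write_manifest` are hypothetical helper names, and the output layout (digest, two spaces, relative path) is chosen to resemble what `md5sum` prints.

```python
import hashlib
import os

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest

def write_manifest(manifest, out_path):
    """Write one 'digest  relative/path' line per file, sorted by path."""
    with open(out_path, "w", encoding="utf-8") as out:
        for rel_path in sorted(manifest):
            out.write(f"{manifest[rel_path]}  {rel_path}\n")
```

A call such as `write_manifest(build_manifest("/mnt/disk8"), "disk8.md5")` would record one disk; the output file name is a placeholder, and the manifest is best stored somewhere other than the disk it describes.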

  • Author

The entire syslog was 16 MB in size, so here is a portion. If there is a better way to post a syslog, please let me know.

 

Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761704
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761712
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761720
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761728
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761736
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761744
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761752
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761760
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761768
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761776
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761784
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761792
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] Unhandled error code
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl]  
Jan  1 11:57:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] CDB: 
Jan  1 11:57:59 Tower kernel: cdb[0]=0x88: 88 00 00 00 00 00 08 36 1c 48 00 00 04 00 00 00
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761800
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761808
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761816
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761824
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761832
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761840
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761848
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761856
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761864
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761872
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761880
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761888
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761896
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761904
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761912
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761920
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761928
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761936
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761944
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761952
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761960
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761968
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761976
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761984
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761992
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762000
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762008
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762016
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762024
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762032
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762040
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762048
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762056
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762064
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762072
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762080
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762088
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762096
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762104
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762112
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762120
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762128
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762136
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762144
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762152
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762160
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762168
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762176
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762184
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762192
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762200
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762208
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762216
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762224
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762232
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762240
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762248
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762256
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762264
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762272
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762280
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762288
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762296
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762304
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762312
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762320
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762328
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762336
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762344
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762352
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762360
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762368
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762376
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762384
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762392
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762400
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762408
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762416
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762424
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762432
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762440
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762448
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762456
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762464
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762472
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762480
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762488
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762496
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762504
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762512
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762520
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762528
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762536
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762544
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] Unhandled error code
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762552
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl]  
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762560
Jan  1 11:57:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762568
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] CDB: 
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762576
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762584
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762592
Jan  1 11:57:59 Tower kernel: cdb[0]=0x88: 88 00 00 00 00 00 08 36 20 48 00 00 02 38 00 00
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762600
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762608
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762616
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762624
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762632
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762640
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762648
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762656
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762664
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762672
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762680
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762688
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762696
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762704
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762712
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762720
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762728
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762736
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762744
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762752
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762760
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762768
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762776
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762784
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762792
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762800
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762808
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762816
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] Unhandled error code
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl]  
Jan  1 11:57:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] CDB: 
Jan  1 11:57:59 Tower kernel: cdb[0]=0x88: 88 00 00 00 00 00 08 36 22 80 00 00 01 c8 00 00
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762824
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762832
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762840
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762848
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762856
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762864
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762872
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762880
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762888
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762896
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762904
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762912
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762920
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762928
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762936
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762944
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762952
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762960
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762968
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762976
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762984
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137762992
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763000
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763008
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763016
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763024
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763032
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763040
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763048
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763056
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763064
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763072
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763080
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763088
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763096
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763104
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763112
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763120
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763128
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763136
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763144
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763152
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763160
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763168
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763176
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763184
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763192
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763200
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763208
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763216
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763224
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763232
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763240
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763248
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763256
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763264
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763272
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763280
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763288
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763296
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763304
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763312
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763320
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763328
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763336
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763344
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763352
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763360
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763368
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763376
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763384
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763392
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763400
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763408
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763416
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763424
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763432
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763440
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763448
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763456
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763464
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763472
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763480
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763488
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763496
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763504
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763512
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763520
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763528
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763536
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763544
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763552
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763560
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763568
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763576
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763584
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763592
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763600
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763608
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763616
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763624
Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137763632
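
A log this repetitive is easier to read once it is summarized. The Python sketch below counts the `md: diskN read error, sector=...` lines per disk and reports the first and last sector seen. The regular expression is based only on the lines shown above, and `summarize_read_errors` is a hypothetical helper name, not part of unRAID.

```python
import re
from collections import defaultdict

# Matches lines such as "md: disk8 read error, sector=137761704".
READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")

def summarize_read_errors(lines):
    """Return {disk_number: (error_count, lowest_sector, highest_sector)}."""
    sectors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            sectors[int(match.group(1))].append(int(match.group(2)))
    return {
        disk: (len(found), min(found), max(found))
        for disk, found in sectors.items()
    }

# Three lines copied from the excerpt above.
sample = [
    "Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761704",
    "Jan  1 11:57:59 Tower kernel: md: disk8 read error, sector=137761712",
    "Jan  1 11:57:59 Tower kernel: sd 10:0:0:0: [sdl] Unhandled error code",
]
summary = summarize_read_errors(sample)
```

To run it over a saved copy of the full log, pass an open file object instead of the sample list, for example `summarize_read_errors(open("syslog.txt"))`, where `syslog.txt` is a placeholder file name.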

  • Author

Looking at the GUI, it is showing only 10 writes to drive 8. Would this mean it isn't correcting bits on drive 8? Is unRAID trying to read bits from drive 8 to rebuild drive 6? Is drive 6 going to be corrupt?

http://lime-technology.com/wiki/index.php?title=Troubleshooting#Capturing_your_syslog

 

Looking at the GUI, it is showing only 10 writes to drive 8. Would this mean it isn't correcting bits on drive 8? Is unRAID trying to read bits from drive 8 to rebuild drive 6? Is drive 6 going to be corrupt?

 

That's exactly how the system works. It reads from all the working drives (including parity) to recalculate the information for disk 6. I'm not sure at this point about corruption, however.

 

 

I've been using unRAID for just over a year. It has been a great experience, with the exception of my stupid choice to use Seagate drives... I've lost about 4 or so in just under a year. Time to switch to WD Red, I think.

 

HGST drives are the most reliable. Look at the most recent Backblaze study (see the sticky I created in the hard drives subforum).

 

I woke up this morning to an email showing that my number 6 drive had failed. I stopped the array and rebooted the system, hoping the drive would come back. No such luck.

 

It should NEVER come back under this circumstance. Once a drive is red-balled, it will never become un-red-balled unless you take some action (or there is a bug). If it did magically come back, it would be a VERY bad thing, as parity would be out of sync with your drives.

 

I found the offending drive in my rig and replaced it with another drive. I logged into unRAID, verified that all the other drives were fine and that my drive 6 replacement had the blue dot, then started the array and the rebuild process.

 

So the current state is that the old drive 6 is outside the array (in your hands), and the new drive 6 is being rebuilt?

 

I just checked on the rebuild status and noticed that disk 8 now shows * for temperature and 171141458 errors. I am thoroughly afraid of losing data. Advice on how to proceed is greatly appreciated from this great community.

 

It is very uncommon (I have never seen it) for a drive to die like a light bulb, where it blinks and it's gone. A drive typically starts showing signs first. More often than not a red-ball is a cabling issue. I suspect that the disk6 in your hands is fine. However, it did red-ball, and once that happens, any writes to "disk6" update a "simulated disk6". So the physical disk6 contents and the simulated disk6 contents are now out of sync. How far out of sync? It depends on how long ago the disk red-balled and how much data you copied to simulated disk6. If you don't recover the simulated disk6, any writes done to disk6 since the red-ball are lost.

 

So now we have disk8 spewing errors. This likely means that disk8's cable got knocked loose (as Squid said). I can almost guarantee that you do not have drive cages. Your story is the poster child for why every array needs them!

 

But here is what you have to do.

1 - Stop the rebuild

2 - Stop the array

3 - Back up the config folder from your flash drive (this has to be done with the server offline)

4 - Shut down the server (so it powers down)

5 - VERY CAREFULLY open the server and secure both sides of the disk8 cable without knocking anything else loose. This is not easy, but take your time and do your best. The backup we took at step 3 will enable you to retry should you not be successful in getting all the drives connected.

6 - Power up

7 - Unassign slot 6 (if assigned)

8 - Start the array

9 - Examine the simulated disk6. See if it looks good. If not, post back

10 - Stop array, assign slot 6 to your new disk6 (your original disk6 is still in your hands)

11 - Start the array and rebuild of disk6

 

Disk6 should rebuild using the contents of parity and the other disks in the array. It should look exactly like the simulated disk6 you examined in step 9.

  • Author
So the current state is that the old drive 6 is outside the array (in your hands), and the new drive 6 is being rebuilt?

 

Yes, I physically removed the old disk six and it is sitting on a shelf.  The new drive 6 was what was being rebuilt.

 

How far out of sync? It depends on how long ago the disk red-balled and how much data you copied to simulated disk6. If you don't recover the simulated disk6, any writes done to disk6 since the red-ball are lost.

 

I personally have not copied anything to my server today.  The disk redballed this morning and I have not copied anything to the server.

 

Per your directions, I have stopped the array and the rebuild of drive 6. I will copy the flash drive onto my desktop as a backup and do my best to fix any cabling issues on drive 8. If I make it to your step 10, will the rebuild of drive 6 start from the beginning again, using correct data from drive 8 now that it is connected properly, or will it try to resume where it left off?

 

Thank you both for your help.

Rebuild will occur from the beginning.

 

Although the data cable is the likely culprit, it could also be the power cable that was nudged loose. Check both.

  • Author

Rebuild is at 5% and drive 8 is showing no errors. Fingers crossed! Time to look at a new case with cages. I will look at your advice on drives as well. Thanks!

  • Author

As far as I could tell. It was mostly ISOs and JPGs, which would load.

 

I suspect all will be well.

 

The disk6 in your hand (that you pulled from the server) could be used to do a file by file md5 comparison to ensure all files match, or you could just spot check some files.

 

I would especially focus on the last several gigabytes of files copied to disk6. If any files are corrupted, they are most likely among those.

 

Good luck!

  • Author

Doing a checksum comparison is a good idea. I will do that, thanks!

If you find any that mismatch, DO NOT DELETE EITHER FILE and let me know.

 

In order to have both drives in the server, you would need to mount the disk outside the array. If you have an extra SATA port, I can explain how to mount it to do your compare.

 

Rebuild still working ok?

  • Author

The rebuild is still going, with no errors visible. I had hoped to mount the failed drive 6 on another PC using a USB adapter, but from my reading over the past half hour that doesn't seem possible. Mounting it outside the array, as you suggest, looks like what will have to be done.

You may be able to mount the disk in the unRAID server via the USB adapter. Start a thread and maybe someone can help. I once precleared a USB-mounted disk.

 

Mounting it in the server itself is only complicated because you have to open your server to connect it up. If you were able to put it in a drive cage, the commands to mount it are simple. (Sorry to rub it in ;))

  • Author

Still rebuilding. I'm kind of curious now as to what to do with the failed disk 6 I have (after I verify files), as well as another failed disk I have. (Both simply showed random red dots one day.) I will try a few runs of preclear on them and see if they're fine. Does that sound like a bad move?

 

First check the SMART reports, which takes no time. There's no use preclearing the disks if they already have lots of problems. But if the SMART reports look OK, preclearing is the right next step.

 

Feel free to post the reports and someone can let you know if anything looks concerning.

  • Author

It seems that I could mount an unRAID drive over USB on a Windows box using a driver or program. However, these do not seem to allow writing to the drive (which my MD5 program will want to do when creating the hash file). It looks like mounting the drive outside the array would be a better option. Do you recommend SNAP?

md5sum will work. It writes its output to the console, which you could redirect to a file on a writable disk. There is a free version I found for Windows that works just like the Linux version that comes with unRAID.
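
Once md5sum output has been saved for both the old disk6 and the rebuilt disk6, the two listings can be compared without writing anything to either disk. The Python sketch below is one way to do that. It assumes both listings were produced from the root of each disk so the relative file names line up, and it ignores md5sum's escaping of unusual file names. The function names are hypothetical.

```python
def parse_md5sum_output(text):
    """Parse 'digest  name' or 'digest *name' lines into {name: digest}."""
    digests = {}
    for line in text.splitlines():
        if len(line) < 35:
            continue  # blank or malformed line: 32 hex digits + 2 chars + name
        digest, name = line[:32].lower(), line[34:]
        digests[name] = digest
    return digests

def compare_manifests(old_text, new_text):
    """Return (mismatched, missing) file names, relative to the old listing."""
    old = parse_md5sum_output(old_text)
    new = parse_md5sum_output(new_text)
    mismatched = sorted(
        name for name in old if name in new and old[name] != new[name]
    )
    missing = sorted(name for name in old if name not in new)
    return mismatched, missing
```

Both functions only read the text they are given, so nothing on either disk is modified or deleted, which matches the earlier advice to leave both copies of any mismatched file in place.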

 

I have never used SNAP, but others have had success with it. I can't be much help as I am out of town this weekend with no access to my server.

Archived

This topic is now archived and is closed to further replies.
