
Unraid 6.9.2 Parity check finished, 100 errors



Feb 19 13:23:12 tower emhttpd: unclean shutdown detected

An unclean shutdown like this will often result in a few sync errors, and the timestamp agrees with your history. The unclean shutdown triggered an automatic non-correcting parity check (note the "nocorrect"):

Feb 19 13:24:23 tower kernel: mdcmd (36): check nocorrect
Feb 19 13:24:23 tower kernel: md: recovery thread: check P Q ...

 

You later ran a correcting parity check, but disk2 was having problems while that was happening.

Mar  1 04:00:01 tower kernel: mdcmd (37): check 
Mar  1 04:00:01 tower kernel: md: recovery thread: check P Q ...
Mar  1 04:00:09 tower kernel: ata5.00: exception Emask 0x50 SAct 0x3fc00 SErr 0x280901 action 0x6 frozen
Mar  1 04:00:09 tower kernel: ata5.00: irq_stat 0x0c000000, interface fatal error
Mar  1 04:00:09 tower kernel: ata5: SError: { RecovData UnrecovData HostInt 10B8B BadCRC }
Mar  1 04:00:09 tower kernel: ata5.00: failed command: READ FPDMA QUEUED
Mar  1 04:00:09 tower kernel: ata5.00: cmd 60/00:50:40:00:00/04:00:00:00:00/40 tag 10 ncq dma 524288 in
Mar  1 04:00:09 tower kernel:         res 40/00:88:40:1c:00/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Mar  1 04:00:09 tower kernel: ata5.00: status: { DRDY }
....and more
Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7168
Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7176
Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7184
...and more 

Check the connections on disk2, then disable spindown on disk2 and run an extended SMART test. If that passes, you need to run another correcting check.

 

Then run a non-correcting parity check to verify. The only acceptable result is exactly zero sync errors, so you have been in an unacceptable state for several weeks now.
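If you prefer the command line to the GUI for the extended SMART test, here is a minimal Python sketch that builds the relevant `smartctl` invocations. `smartctl -t long` starts an extended self-test and `smartctl -l selftest` prints the self-test log. The function names and the `/dev/sdX` device path are placeholders of mine, not anything from your diagnostics, so substitute the device that actually backs disk2.

```python
import subprocess


def smart_commands(device):
    """Build smartctl command lines for an extended self-test on `device`.

    `device` is a placeholder such as "/dev/sdX"; use the real device node.
    """
    return {
        # Start the extended (long) self-test; it runs in the background on the drive.
        "start_test": ["smartctl", "-t", "long", device],
        # Show the self-test log, where the result appears once the test finishes.
        "test_log": ["smartctl", "-l", "selftest", device],
    }


def run(cmd):
    """Run one command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
```

For example, `run(smart_commands("/dev/sdX")["start_test"])` would start the test, and running the `"test_log"` command some hours later would show whether it completed without error. Spindown must stay disabled for the drive while the test runs, as noted above.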

 

You should have seen the I/O errors for disk2 in the Errors column on Main - Array Devices. Check to make sure you aren't still getting them when you correct parity again.
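Besides the Errors column, you can confirm this from the syslog itself. The following is a rough Python sketch that counts `md: diskN read error` lines per disk. The pattern is based only on the log lines quoted above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from lines quoted in this thread, e.g.
# "Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7168"
READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Return a Counter mapping disk name -> number of read-error lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Mar  1 04:00:01 tower kernel: md: recovery thread: check P Q ...",
    "Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7168",
    "Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7176",
    "Mar  1 04:00:28 tower kernel: md: disk2 read error, sector=7184",
]
print(count_read_errors(sample))
```

On the server you would pass an open syslog file instead of `sample` (I am assuming it lives at `/var/log/syslog`). If the next correcting check finishes and this reports no new lines for disk2, the connection problem is most likely fixed.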


I can't see any more errors on disk2; the count in the Errors column is 1024 (a suspiciously round number).

I have tried some random reads of files from that disk and no errors were generated, but that doesn't tell me anything about the disk itself. It's only a "weak" indication that the cables are OK (I have reconnected all connectors).

I have started an extended self-test; we'll see.

Are these 100 errors fully correctable, or do I have to worry that 100 sectors might be gone forever?

 

The setup is old (well, it was new in 2012 ;-) ), so it's probably time to refresh. The biggest problem is that I don't want to change all disks at the same time, to avoid having the whole storage at the same point on the bathtub curve. I have only ever had one disk fail in this setup - 1ST500LM021-1KJ152_W621GQWL - 500 GB (sdj), which annoyingly was the only cache at that time. Thankfully it wasn't too difficult to rebuild the VMs and dockers.

 

35 minutes ago, Jerry1111 said:

Are these 100 errors fully correctable, or do I have to worry that 100 sectors might be gone forever?

The read errors on disk2 were due to a bad connection. If that is fixed, then those sectors should be read the next time parity is corrected, assuming nothing is actually wrong with disk2, of course.

  • 4 weeks later...

Sorry for reporting back late - got distracted with house stuff.

I took the server out, then cleaned and re-seated all of the connectors. I ran an extended SMART test, followed by a read-only parity check - everything is back to normal. Many thanks for the help.

Given my random collection of old disks (and this scare!), it's probably time to start swapping the disks for new ones. I should probably do it slowly, to avoid all of the new disks landing at the same point of the bathtub curve. On the other hand - if it ain't broke...
