Parity check errors


Hello forum members!

I'm using version 4.5.4.

I got some errors during the monthly parity check:

Jan 1 00:00:02 Tower kernel: mdcmd (184090): check
Jan 1 00:00:02 Tower kernel: md: recovery thread woken up ...
Jan 1 00:00:02 Tower kernel: md: recovery thread checking parity...
Jan 1 00:00:02 Tower kernel: md: using 1152k window, over a total of 976762552 blocks.
Jan 1 00:39:51 Tower kernel: ata1.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Jan 1 00:39:51 Tower kernel: ata1.00: BMDMA stat 0x5
Jan 1 00:39:51 Tower kernel: ata1: SError: { UnrecovData Proto TrStaTrns }
Jan 1 00:39:51 Tower kernel: ata1.00: failed command: READ DMA EXT
Jan 1 00:39:51 Tower kernel: ata1.00: cmd 25/00:08:a7:a4:8c/00:01:0f:00:00/e0 tag 0 dma 135168 in
Jan 1 00:39:51 Tower kernel: res 51/84:38:77:a5:8c/84:00:0f:00:00/e0 Emask 0x12 (ATA bus error)
Jan 1 00:39:51 Tower kernel: ata1.00: status: { DRDY ERR }
Jan 1 00:39:51 Tower kernel: ata1.00: error: { ICRC ABRT }
Jan 1 00:39:51 Tower kernel: ata1: hard resetting link
Jan 1 00:39:51 Tower kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 1 00:39:51 Tower kernel: ata1.00: configured for UDMA/133
Jan 1 00:39:51 Tower kernel: ata1: EH complete

 

It is also reported that:

Parity is Valid:.  Last parity check 3 days ago with no sync errors.

Can someone tell me what these errors mean and whether I should worry about them?

Thanks so much.


The disk error is an ICRC ABRT error: an interface CRC (checksum) failure in communicating with the drive, after which the command was aborted.  It could indicate a poor-quality SATA cable, a noisy power supply, a SATA cable bundled too closely to a noisy power supply line, or something else.  The OS reacted by resetting the link to the drive and trying again.  There is not much you can do about it unless it continues.
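To see where the ICRC ABRT label comes from, here is a minimal Python sketch that decodes the raw values from the log: the "res 51/84:..." field (status byte, then error byte) and the SErr value 0x1000500. The bit names are the labels the Linux libata driver prints; the tables are deliberately incomplete, and the function names decode_bits and decode_result are made up for this illustration.

```python
# Bit labels as printed by the Linux libata driver in the syslog.
# Status bit 0x10 has no label in libata's summary line, so it is skipped here.
STATUS_BITS = [(0x80, "BUSY"), (0x40, "DRDY"), (0x20, "DF"), (0x08, "DRQ"), (0x01, "ERR")]
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNC"), (0x10, "IDNF"), (0x04, "ABRT"), (0x01, "AMNF")]

# Partial SError table: the three bits seen in this log plus two CRC-related ones.
SERROR_BITS = [
    (1 << 8, "UnrecovData"),
    (1 << 10, "Proto"),
    (1 << 21, "BadCRC"),
    (1 << 22, "Handshk"),
    (1 << 24, "TrStaTrns"),
]


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def decode_result(res_field):
    """Split a libata 'res' field such as '51/84:38:...' into status and error flags."""
    status_hex, error_hex = res_field.split(":")[0].split("/")
    return (
        decode_bits(int(status_hex, 16), STATUS_BITS),
        decode_bits(int(error_hex, 16), ERROR_BITS),
    )


status, error = decode_result("51/84:38:77:a5:8c/84:00:0f:00:00/e0")
print("status:", status)
print("error: ", error)
print("SError:", decode_bits(0x1000500, SERROR_BITS))
```

The decoded flags match what the kernel already printed (DRDY ERR, ICRC ABRT, and UnrecovData Proto TrStaTrns), so the sketch adds no new diagnosis; it only shows that the readable labels are a direct decoding of the hex values.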

 

The "Parity check completed" message is just informational.  It indicates your array is basically keeping parity just fine and the monthly check you ran on the first of the month was completed without finding any errors.

 

Joe L.


I find those in my syslog from time to time. In my case, I typically see two or three of those resets, involving multiple drives, grouped closely together in time. Then it's fine for a few days before it happens again. I think it's my 1 TB Seagate drives, but I'm not sure and haven't worried enough to look into it further. My server has been running that way for a couple of years now without any other complications. So keep an eye on those resets, and only worry about them if they occur multiple times in a row for the same drive.
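One way to keep an eye on those resets is to count them per ATA port. The sketch below is only an illustration: it assumes the syslog lines look like the ones quoted in this thread, and the function name count_link_events and the sample data are invented for the example.

```python
import re
from collections import Counter

# Matches "ata1: hard resetting link" and "ata1.00: error: { ICRC ABRT }" style lines.
PORT_EVENT = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: (hard resetting link|error: \{[^}]*ICRC[^}]*\})"
)


def count_link_events(lines):
    """Return (resets, crc_errors), each a Counter keyed by port name such as 'ata1'."""
    resets, crc_errors = Counter(), Counter()
    for line in lines:
        match = PORT_EVENT.search(line)
        if match is None:
            continue
        port, event = match.groups()
        if event.startswith("hard resetting"):
            resets[port] += 1
        else:
            crc_errors[port] += 1
    return resets, crc_errors


# Sample input copied from the log in the first post.
SAMPLE = [
    "Jan 1 00:39:51 Tower kernel: ata1.00: status: { DRDY ERR }",
    "Jan 1 00:39:51 Tower kernel: ata1.00: error: { ICRC ABRT }",
    "Jan 1 00:39:51 Tower kernel: ata1: hard resetting link",
    "Jan 1 00:39:51 Tower kernel: ata1: EH complete",
]

resets, crc_errors = count_link_events(SAMPLE)
print("resets per port:    ", dict(resets))
print("CRC errors per port:", dict(crc_errors))
```

To run it against a real log, pass an open file instead of SAMPLE, for example count_link_events(open(path)) with path pointing at your syslog. A port whose count keeps climbing is the one whose cable and power connections deserve a closer look.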

 

Peter

 


Good catch, Peter.

I wondered about this myself, but it concerned me less than those errors. I have no idea how to enable the full 3 Gbps rate. I will double-check the BIOS, but beyond that I don't know what needs to be done, as new drives no longer have jumpers.
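For reference, the "SStatus 113 SControl 310" pair in the log can be decoded with a few lines of Python. This is a rough sketch that assumes the standard SATA register layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and that the kernel prints both values in hex; the helper names are made up for the example.

```python
# SPD field values defined for SATA link speed generations.
LINK_SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def split_fields(hex_text):
    """Split an SStatus or SControl value (hex string) into its DET, SPD and IPM nibbles."""
    value = int(hex_text, 16)
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def negotiated_speed(sstatus_hex):
    """Speed the link actually came up at, according to SStatus."""
    return LINK_SPEEDS.get(split_fields(sstatus_hex)["spd"], "no link / unknown")


def host_speed_limit(scontrol_hex):
    """Highest speed the host allows, according to SControl (SPD 0 means no limit)."""
    spd = split_fields(scontrol_hex)["spd"]
    return "no limit" if spd == 0 else LINK_SPEEDS.get(spd, "unknown")


print("negotiated:", negotiated_speed("113"))
print("host limit:", host_speed_limit("310"))
```

If that reading of the registers is right, the SControl value suggests the host side is holding the link at 1.5 Gbps, rather than the drive negotiating down on its own. That is only a hint: the log alone does not show whether the cap comes from the controller, a BIOS setting, or the driver.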

 

Thanks for your reply.
