MyMain Smart Errors, Need I be Worried?

September 8, 2011

Having some random reboots, and not really sure why... I doubt it's specifically related to the HDs, but wanted to get opinions on whether I should worry about these errors (or which are most worrisome). I replaced all SATA cables with brand new locking ones about 6 months or so ago. My MyMain SMART screenshot is below!

My syslog indicates some errors, but I'm not 100% sure which drive is being reported (full syslog file attached).

Sep 7 17:25:04 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6 (Errors)

Sep 7 17:25:04 Media kernel: ata11.00: irq_stat 0x08000000 (Drive related)

Sep 7 17:25:04 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk } (Errors)

Sep 7 17:25:04 Media kernel: ata11.00: failed command: READ FPDMA QUEUED (Minor Issues)

Sep 7 17:25:04 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in (Drive related)

Sep 7 17:25:04 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error) (Errors)

Sep 7 17:25:04 Media kernel: ata11.00: status: { DRDY } (Drive related)

Sep 7 17:25:04 Media kernel: ata11: hard resetting link (Minor Issues)

Sep 7 17:25:04 Media kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)

Sep 7 17:25:04 Media kernel: ata11.00: configured for UDMA/133 (Drive related)

Sep 7 17:25:04 Media kernel: ata11: EH complete (Drive related)

Sep 7 17:25:04 Media kernel: ata11: limiting SATA link speed to 1.5 Gbps (Drive related)

Sep 7 17:25:04 Media kernel: ata11.00: exception Emask 0x0 SAct 0x1 SErr 0x980000 action 0x6 frozen (Errors)

Sep 7 17:25:04 Media kernel: ata11: SError: { 10B8B Dispar LinkSeq } (Errors)

Sep 7 17:25:04 Media kernel: ata11.00: failed command: READ FPDMA QUEUED (Minor Issues)

Sep 7 17:25:04 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in (Drive related)

Sep 7 17:25:04 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) (Errors)

9-7-2011 5-29-19 PM.jpg

syslog-2011-09-07.txt


September 8, 2011

Looking at your smart view, you have some serious drive issues going on:

More Serious

disk7 - reallocated_sector_ct=181

disk12 - reallocated_sector_ct=70

disk14 - reallocated_sector_ct=39

Less Serious

disk4 - reallocated_sector_ct=9

disk5 - current_pending_sector=3

disk6 - reported_uncorrect=9 / ata_error_count=9

disk9 - reallocated_sector_ct=1

disk13 - reallocated_sector_ct=1

disk16 - HPA?

disk18 - HPA?

Reallocated Sectors / Current_Pending_Sectors

---------------------------------------------

Reallocated sectors indicate that the drive has detected that a sector is flawed and not able to store data, so it has mapped a spare sector in its place. This is a good thing. The problem is that often, once sectors start going bad and being remapped, more sectors go bad, and the next thing you know all of the spare sectors are used up and the drive is toast.

But sometimes a few bad sectors are just a few bad sectors, and subsequent reallocations do not occur. If this is the case the drive is fine.

A current_pending_sector is a sector that has been identified for future reallocation. Normally these sectors get reallocated as the drive is used.

You need to run some parity checks and see if the number of reallocated sectors and/or pending sectors increases with each check. If you can't get through 3 parity checks in a row without the reallocated sector count increasing on a drive, you should RMA that drive.
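If you'd rather not compare the numbers by eye, here is a minimal Python sketch of that before/after comparison. It assumes you save the attribute table printed by `smartctl -A` for a drive before and after each parity check; the function names are my own, and the attribute names are the ones smartctl normally prints for these two counters.

```python
import re

# Counters discussed above, named as they usually appear in `smartctl -A` output.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style text."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                values[fields[1]] = int(match.group())
    return values


def growing_attributes(before, after, watched=WATCHED):
    """Return {name: (old, new)} for watched counters that increased."""
    return {
        name: (before[name], after[name])
        for name in watched
        if name in before and name in after and after[name] > before[name]
    }
```

Feed it the saved text from before and after a check; an empty result means neither counter moved on that drive during that check.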

Ata_Error_Count and syslog errors

----------------------------------

These errors usually indicate some type of cabling issue. I would re-secure the cables on this drive.
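For the curious, the SErr value in those syslog lines is a bit mask, and the names the kernel prints after "SError:" are just the bits that are set. Below is a rough Python sketch of the decoding. The bit table is my reading of the SATA SError register layout, so treat it as an assumption, although both values from your log decode to exactly the names the kernel printed.

```python
# Bit position -> short name as printed in the kernel's "SError: { ... }" line.
# Assumed from the SATA SError register layout; not copied from this syslog.
SERROR_FLAGS = {
    0: "RecovData", 1: "RecovComm", 8: "UnrecovData", 9: "Persist",
    10: "Proto", 11: "HostInt", 16: "PHYRdyChg", 17: "PHYInt",
    18: "CommWake", 19: "10B8B", 20: "Dispar", 21: "BadCRC",
    22: "Handshk", 23: "LinkSeq", 24: "TrStaTrns", 25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of the SError bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(SERROR_FLAGS.items())
            if value & (1 << bit)]


print(decode_serror(0x780100))
print(decode_serror(0x980000))
```

10B8B, Dispar and BadCRC are all link-level problems (data getting mangled on the wire between drive and controller), which is why the cabling is the first thing to check.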

The ata11 from your syslog is actually your cache disk.
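One way to make that kind of lookup yourself, as a hedged Python sketch: at boot the kernel normally logs an identification line for each port, of the form "ataN.00: ATA-8: MODEL, FIRMWARE, max UDMA/133", and the model string can then be matched against the drive list. This assumes those boot-time lines are still present in the syslog file; the function name is my own, and this is not necessarily how the port was identified here.

```python
import re

# Matches the kernel's boot-time identification line, for example:
#   "... kernel: ata11.00: ATA-8: <model>, <firmware>, max UDMA/133"
IDENT = re.compile(r"\b(ata\d+)\.\d+: ATA-\d+: ([^,]+),")


def ata_port_models(syslog_text):
    """Return {ata_port: model_string} from identification lines in a syslog."""
    ports = {}
    for line in syslog_text.splitlines():
        match = IDENT.search(line)
        if match:
            ports[match.group(1)] = match.group(2).strip()
    return ports
```

Calling `ata_port_models(open("syslog-2011-09-07.txt").read())` on the attached file would give a port-to-model table. If two drives share a model, the model string alone will not tell them apart.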

HPA

---

HPA is not a disk error, but indicates that the BIOS has used a trick to reduce (very slightly) the size of your disk and used the space to keep a backup of your BIOS settings. For technical reasons, HPAs can cause problems with newer versions of unRAID.

Fortunately, I don't believe you actually have HPAs - this looks like a false positive. myMain does not have a reference value for a 360G drive (I have actually never heard of this size before). So I would not worry about these unless you are using a Gigabyte motherboard.

If you send me a screenshot of the "Details" myMain view, I can add the reference value for a 360G drive and you will not get the false positives in the future.


September 8, 2011

Author

Thanks for the reply bjp999... Details screenshot sent... the 360GB drives I think were OEM drives... got these from a friend... They are on the next-to-be-updated list (unless I need to replace some of the failing larger drives first).

I'm not sure how long the ATA_Error counts have been going on... it's possible/likely that this was occurring before the last SATA cable swap. Is there a way to determine a date/time stamp for the errors?

I'll run through a few parity checks over the next few days and see if any of my reallocated sector #'s increase...


September 8, 2011

Author

I started my first parity check and it started puking all over ATA4 (and I hear clicking)... how exactly do I find which drive this is?

Sep 7 21:44:21 Media kernel: res 40/00:58:47:5b:f3/00:00:00:00:00/40 Emask 0x10 (ATA bus error) (Errors)