Spate of recent UDMA CRC errors after hardware change


fitbrit


Hi all

 

I recently upgraded the guts of my server with a new mobo, CPU and RAM. I went from a Socket 775 platform to a 4th-gen i7 for my NORCO 4224. I have 8 drives connected via SATA 3 on the motherboard: parity 1, parity 2 and disks 1-6. The other 16 drives are connected via Marvell controllers, which I know are no longer recommended. They are not giving me any issues right now.

What I am having trouble with is parity 2 and disks 2, 3 and 5, all of which are connected to the motherboard SATA ports. A spate of UDMA CRC errors has occurred recently, about 100 over the past 6 days. They started soon after the hardware change.
What I would like to know is how serious they are. I have ordered new reverse breakout cables, which I will install when they arrive. If the new cables do not fix the issue, could the motherboard SATA ports be bad? Since the errors began, I have installed a new parity disk, added a second parity disk, and am currently increasing the size of one of the disks. Should I be concerned that my array now has erroneous data on it, or are these errors self-correcting?

Thanks for any light that can be shed.

 

Model: Custom
M/B: MSI - Z97 GAMING 7 (MS-7916)
CPU: Intel® Core™ i7-4770K CPU @ 3.50GHz
HVM: Enabled
IOMMU: Disabled
Cache: 256 kB, 1024 kB, 8192 kB
Memory: 32 GB (max. installable capacity 32 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.18.8-unRAID x86_64
OpenSSL: 1.1.0i
Uptime: 

Sep 26 12:09:00 MEDIASERVER root: Fix Common Problems Version 2018.09.08
Sep 26 12:09:05 MEDIASERVER root: Fix Common Problems: Warning: Marvel Hard Drive Controller Installed ** Ignored
Sep 26 12:09:08 MEDIASERVER ntpd[1864]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Sep 26 12:34:31 MEDIASERVER kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Sep 26 12:34:31 MEDIASERVER kernel: ata6.00: irq_stat 0x08000000, interface fatal error
Sep 26 12:34:31 MEDIASERVER kernel: ata6: SError: { UnrecovData 10B8B BadCRC }
Sep 26 12:34:31 MEDIASERVER kernel: ata6.00: failed command: READ DMA EXT
Sep 26 12:34:31 MEDIASERVER kernel: ata6.00: cmd 25/00:00:a8:5c:0a/00:02:12:00:00/e0 tag 29 dma 262144 in
Sep 26 12:34:31 MEDIASERVER kernel: res 50/00:00:a7:5c:0a/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
Sep 26 12:34:31 MEDIASERVER kernel: ata6.00: status: { DRDY }
Sep 26 12:34:31 MEDIASERVER kernel: ata6: hard resetting link
Sep 26 12:34:31 MEDIASERVER kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Sep 26 12:34:31 MEDIASERVER kernel: ata6.00: configured for UDMA/133
Sep 26 12:34:31 MEDIASERVER kernel: ata6: EH complete
Sep 26 13:15:40 MEDIASERVER kernel: ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Sep 26 13:15:40 MEDIASERVER kernel: ata2.00: irq_stat 0x08000000, interface fatal error
Sep 26 13:15:40 MEDIASERVER kernel: ata2: SError: { UnrecovData 10B8B BadCRC }
Sep 26 13:15:40 MEDIASERVER kernel: ata2.00: failed command: READ DMA EXT
Sep 26 13:15:40 MEDIASERVER kernel: ata2.00: cmd 25/00:08:90:d3:d8/00:02:27:00:00/e0 tag 3 dma 266240 in
Sep 26 13:15:40 MEDIASERVER kernel: res 50/00:00:8f:d3:d8/00:00:27:00:00/e0 Emask 0x10 (ATA bus error)
Sep 26 13:15:40 MEDIASERVER kernel: ata2.00: status: { DRDY }
Sep 26 13:15:40 MEDIASERVER kernel: ata2: hard resetting link
Sep 26 13:15:40 MEDIASERVER kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Sep 26 13:15:40 MEDIASERVER kernel: ata2.00: configured for UDMA/133
Sep 26 13:15:40 MEDIASERVER kernel: ata2: EH complete
Sep 26 13:20:30 MEDIASERVER kernel: ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Sep 26 13:20:30 MEDIASERVER kernel: ata2.00: irq_stat 0x08000000, interface fatal error
Sep 26 13:20:30 MEDIASERVER kernel: ata2: SError: { UnrecovData 10B8B BadCRC }
Sep 26 13:20:30 MEDIASERVER kernel: ata2.00: failed command: READ DMA EXT
Sep 26 13:20:30 MEDIASERVER kernel: ata2.00: cmd 25/00:08:28:59:66/00:02:2a:00:00/e0 tag 2 dma 266240 in
Sep 26 13:20:30 MEDIASERVER kernel: res 50/00:00:27:59:66/00:00:2a:00:00/e0 Emask 0x10 (ATA bus error)
Sep 26 13:20:30 MEDIASERVER kernel: ata2.00: status: { DRDY }
Sep 26 13:20:30 MEDIASERVER kernel: ata2: hard resetting link
Sep 26 13:20:31 MEDIASERVER kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Sep 26 13:20:31 MEDIASERVER kernel: ata2.00: configured for UDMA/133
Sep 26 13:20:31 MEDIASERVER kernel: ata2: EH complete
Sep 26 13:20:48 MEDIASERVER nginx: 2018/09/26 13:20:48 [error] 7533#7533: *26027 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.2.16, server: , request: "POST /webGui/include/DeviceList.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.2.10", referrer: "http://192.168.2.10/Main"
Sep 26 13:20:48 MEDIASERVER php-fpm[7479]: [WARNING] [pool www] child 2303 exited on signal 7 (SIGBUS) after 84.000580 seconds from start

 

5 minutes ago, fitbrit said:

I am wondering whether it could be the motherboard SATA ports that are bad if the new cables do not fix the issues.

Possible but unlikely; cables, then backplanes, are the most likely culprits.

 

6 minutes ago, fitbrit said:

Should I be concerned that my array now has erroneous data on it, or are these errors self-correcting?

They are corrected automatically. They will affect performance, and in rare cases bad data can get through, but that usually happens only in extreme cases, with thousands of errors in a very short time.
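
To see which ports are affected and how often, the BadCRC lines in the syslog can be tallied per ATA port. This is only a rough sketch in Python: the function name `count_bad_crc` and the `SAMPLE` list are made up for illustration, and the pattern assumes the kernel lines look like the ones pasted in the first post.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: ata6: SError: { UnrecovData 10B8B BadCRC }"
# Assumption: the syslog uses the same layout as the excerpt in this thread.
BAD_CRC = re.compile(r"kernel: (ata\d+): SError: \{[^}]*\bBadCRC\b[^}]*\}")


def count_bad_crc(lines):
    """Return a Counter mapping ATA port name (e.g. 'ata2') to BadCRC events."""
    counts = Counter()
    for line in lines:
        match = BAD_CRC.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Lines copied from the log in the first post, plus one unrelated line.
SAMPLE = [
    "Sep 26 12:34:31 MEDIASERVER kernel: ata6: SError: { UnrecovData 10B8B BadCRC }",
    "Sep 26 13:15:40 MEDIASERVER kernel: ata2: SError: { UnrecovData 10B8B BadCRC }",
    "Sep 26 13:20:30 MEDIASERVER kernel: ata2: SError: { UnrecovData 10B8B BadCRC }",
    "Sep 26 13:20:31 MEDIASERVER kernel: ata2: EH complete",
]

if __name__ == "__main__":
    # Print ports from most to least affected.
    for port, n in count_bad_crc(SAMPLE).most_common():
        print(f"{port}: {n} BadCRC event(s)")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `count_bad_crc(open("/var/log/syslog"))`, assuming that is where the syslog lives on your system. If one port dominates the count, swap its cable first.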


Many thanks!
The backplanes haven't changed in my hardware upgrade. But then again, neither have my cables. I will replace the cables anyway, since they are now quite old.

The only changes are the motherboard ports and the upgrade to Unraid 6.6. Could it be that I had these errors all along, but they were not being reported prior to Unraid 6.6?
