Parity/Read-Check Errors

May 5, 2020

Hello everyone.

My unRaid server has started to report parity errors. The first errors came on 2020-02-02 (1514 errors), then no errors until 2020-02-23 (4 errors). Since 2020-03-22 I have been getting errors frequently, as seen in the screenshots. There might have been an unclean shutdown before 2020-03-22, but since then I have restarted the server a few times without any unclean shutdowns. I got 626,219 errors on 2020-05-03, which seems like a lot, so I ran the check again and got 3,643 errors (2020-05-04).

I have attached the diagnostics. Thanks in advance.

tower-diagnostics-20200505-2319.zip

May 6, 2020

Community Expert

On mobile now so can't look at Diagnostics yet.

Have you done memtest?

May 6, 2020

Community Expert

Do a couple of consecutive parity checks without rebooting and post new diags, but first you need to fix this error spamming the log (and then reboot):

Apr 28 04:43:38 Tower nginx: 2020/04/28 04:43:38 [error] 3684#3684: *1298377 connect() to unix:/var/tmp/HomeAssistantCore.sock failed (111: Connection refused) while connecting to upstream, client: 192.168.1.157, server: , request: "GET /dockerterminal/HomeAssistantCore/token HTTP/1.1", upstream: "http://unix:/var/tmp/HomeAssistantCore.sock:/token", host: "tower", referrer: "http://tower/dockerterminal/HomeAssistantCore/"
Apr 28 04:44:28 Tower nginx: 2020/04/28 04:44:28 [error] 3684#3684: *1298470 connect() to unix:/var/tmp/HomeAssistantCore.sock failed (111: Connection refused) while connecting to upstream, client: 192.168.1.157, server: , request: "GET /dockerterminal/HomeAssistantCore/ws HTTP/1.1", upstream: "http://unix:/var/tmp/HomeAssistantCore.sock:/ws", host: "tower"

May 6, 2020

Author

16 hours ago, johnnie.black said:

Do a couple of consecutive parity checks without rebooting and post new diags, but first you need to fix this error spamming the log (and then reboot):


Apr 28 04:43:38 Tower nginx: 2020/04/28 04:43:38 [error] 3684#3684: *1298377 connect() to unix:/var/tmp/HomeAssistantCore.sock failed (111: Connection refused) while connecting to upstream, client: 192.168.1.157, server: , request: "GET /dockerterminal/HomeAssistantCore/token HTTP/1.1", upstream: "http://unix:/var/tmp/HomeAssistantCore.sock:/token", host: "tower", referrer: "http://tower/dockerterminal/HomeAssistantCore/"
Apr 28 04:44:28 Tower nginx: 2020/04/28 04:44:28 [error] 3684#3684: *1298470 connect() to unix:/var/tmp/HomeAssistantCore.sock failed (111: Connection refused) while connecting to upstream, client: 192.168.1.157, server: , request: "GET /dockerterminal/HomeAssistantCore/ws HTTP/1.1", upstream: "http://unix:/var/tmp/HomeAssistantCore.sock:/ws", host: "tower"

This was only spamming the log on Apr 28, so the two consecutive parity checks on May 3 and 4 should have run after the spam.

If that isn't correct, should I start by rebooting and running two parity checks, or run a memtest first?

May 7, 2020

Community Expert

Because of the spam the log goes from:

May  3 07:08:39 Tower kernel: ata2.00: status: { DRDY }
May  3 07:08:39 Tower kernel: ata2.00: failed command: READ FPDMA QUEUED
May  3 07:08:39 Tower kernel: ata2.00: cmd 60/40:88:c8:b6:5e/00:00:3d:00:00/40 tag 17 ncq dma 32768 in
May  3 07:08:39 Tower kernel:         res 40/00:58:c8:9c:5e/00

to

May  4 04:40:14 Tower rsyslogd: [origin software="rsyslogd" swVersion="8.1908.0" x-pid="1416" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
May  4 05:00:01 Tower crond[1613]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null

So it's missing most of the first check and doesn't show any corrections. It does show a lot of errors on ATA2 (disk1), so also replace the cables on that disk and then do the 2 consecutive checks.
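For anyone wanting to check their own syslog for the same pattern, the kernel "failed command" lines can be tallied per ATA port. This is a rough sketch only: `failed_commands_by_port` is an illustrative name, and mapping a port such as ata2 to a specific array disk still has to be done by hand from the rest of the diagnostics.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... Tower kernel: ata2.00: failed command: READ FPDMA QUEUED"
FAILED_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: failed command: (.+)$")


def failed_commands_by_port(lines):
    """Count 'failed command' kernel messages per (ATA port, command)."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line.rstrip("\n"))
        if match:
            port = match.group(1)             # e.g. "ata2"
            command = match.group(2).strip()  # e.g. "READ FPDMA QUEUED"
            counts[(port, command)] += 1
    return counts
```

A port that dominates the counts is the first candidate for a cable swap. Errors spread across several ports would point more towards something shared, such as power or the controller.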

May 9, 2020

Author

On 5/7/2020 at 7:53 AM, johnnie.black said:

So it's missing most of the first check and doesn't show any corrections. It does show a lot of errors on ATA2 (disk1), so also replace the cables on that disk and then do the 2 consecutive checks.

Hi again. I replaced the cables for disk 1 and ran two checks, getting 0 errors both times, but I got UDMA CRC error count SMART warnings on three drives (parity, disk1 and disk3). I have gotten these warnings before, but not for a few months. Any ideas what could cause this?

May 9, 2020

Community Expert

8 minutes ago, calmDown said:

UDMA CRC error count SMART warnings on three drives (parity, disk1 and disk3). I have gotten these warnings before, but not for a few months. Any ideas what could cause this?

Usually caused by bad connections or cables. It basically means the data became corrupted between the disk and the rest of the system.

New diagnostics might give some clue assuming you haven't rebooted.
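The counter in question is SMART attribute 199 (UDMA_CRC_Error_Count). Its raw value is cumulative and does not go back to zero after the cause is fixed, so what matters is whether it keeps increasing. A minimal sketch for pulling the raw value out of `smartctl -A` output follows; the helper name `udma_crc_count` is made up for illustration, and it assumes the usual ten-column attribute table with the raw value in the last column.

```python
def udma_crc_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if not found.

    Expects the text printed by `smartctl -A <device>`, where each
    attribute row has the columns:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                # Some drives report a non-numeric raw value here.
                return None
    return None
```

Recording the value for each drive before and after a parity check shows whether new CRC errors are still being logged after the cable change.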

May 9, 2020

Author

46 minutes ago, trurl said:

Usually caused by bad connections or cables. It basically means the data became corrupted between the disk and the rest of the system.

New diagnostics might give some clue assuming you haven't rebooted.

tower-diagnostics-20200509-1304.zip

I have all my drives connected to the motherboard. Could the motherboard be the problem?

May 9, 2020

Community Expert

9 minutes ago, calmDown said:

I have all my drives connected to the motherboard, could the motherboard be the problem?

It could be. You're still getting ATA errors on 3 disks. Try replacing the SATA cables on those disks first, and make sure they are good-quality cables. If that doesn't fix the errors, the motherboard could be the problem.
