SOLVED: Strange Parity Check/Rebuild Error

April 18, 2018

I'm new to unRAID but have read documentation and am continually scratching my head about this issue. Just bought a pro license yesterday but am very concerned about what's been happening.

So, this all started when I ran a parity check one evening and woke up the next morning to find my Parity 1 drive moved to "Unassigned Devices", while also still listed as a spun-up parity drive showing 22 trillion read and write errors. This drive has 32 days of uptime and is brand new, so given the information I had at the time I thought this meant my 2nd Parity drive was bad, since its SMART status was already questionable, and I decided to order a new drive and replace it. I stopped everything, powered off, checked connections, powered on, and found that my unRAID flash drive had become corrupted and/or its filesystem was totally wrecked. I replaced it with a brand new flash drive, was able to recover my config and copy it over, and booted back into unRAID to find that 2 DATA drives were now labeled as "missing" and did not show up in the terminal using lshw -c disk. Parity 1 was also unassigned, but was visible, so I reassigned it. I powered off again, checked connections again, and rebooted to find that one data disk still did not work. Easy enough to deal with, I thought, since that one data disk didn't have any data on it. I replaced that data disk, and now I'm back to pretty much the same scenario as at the start:

I am doing a parity rebuild/check with a NEW 2nd Parity drive and a NEW data drive. The rebuild/check ran all day without issue, but as it approached 1.91TB (past where it should be stopping, since the data drives in my array all have a writable volume of 1.81TB), Parity 1 suddenly listed 22 trillion read and write errors again. It now shows no SMART temperature and, when you drill down, is listed as a SPUN DOWN drive. Mind you, this parity check is still RUNNING as we speak and reports no issues, and the system logs show absolutely nothing. It's completely silent.

I've checked all drives for SMART status, checked my drive configuration, and am completely at a loss. Something is really messed up and I don't know what it is. This is running on a pre-owned but well-conditioned SuperMicro X9SRH-7F and E5-2630Lv1 with 96GB of Samsung ECC DDR3-1066MHz RDIMMs. The problem seems to keep occurring at a particular point and is not intermittent, which leads me to believe it's not a hardware issue. See photos for more info on how my array is situated and the current status.

Edited April 20, 2018 by Distinguished

April 18, 2018

Community Expert

Almost certainly a hardware issue. Don't reboot. Go to Tools - Diagnostics and post the complete zip.

April 18, 2018

Author

Logs attached here. Thanks for the speedy reply. unRAID is still doing a parity check which is presumably borked.

redacted-diagnostics-20180418-1452.zip

April 18, 2018

Your parity disk was originally given the identifier /dev/sdm but it dropped offline and reconnected as /dev/sdp. It looks like it has an intermittent connection, so shut down and check both its SATA and power cables. If you're using some sort of drive cage, put it in a different bay.

I'm not sure why you think your syslog is not showing any errors when it's full of this:

Apr 18 07:30:02 jpus001-pgh1 kernel: md: disk0 read error, sector=27516368
Apr 18 07:30:02 jpus001-pgh1 kernel: md: disk0 read error, sector=27516376
Apr 18 07:30:02 jpus001-pgh1 kernel: md: disk0 read error, sector=27516384
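For anyone who wants to check their own syslog for the same pattern, here is a minimal sketch that tallies md read errors per disk. It assumes the line format quoted above; the function name `count_md_read_errors` is made up for illustration, and on a real server you would feed it the lines of the syslog file from the diagnostics zip rather than the sample list.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted above, e.g.
# "... kernel: md: disk0 read error, sector=27516368"
MD_READ_ERROR = re.compile(r"kernel: md: disk(\d+) read error, sector=(\d+)")

def count_md_read_errors(lines):
    """Return a Counter mapping md disk number -> number of read-error lines."""
    counts = Counter()
    for line in lines:
        match = MD_READ_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

# Sample input: the three lines quoted in this post.
sample = [
    "Apr 18 07:30:02 jpus001-pgh1 kernel: md: disk0 read error, sector=27516368",
    "Apr 18 07:30:02 jpus001-pgh1 kernel: md: disk0 read error, sector=27516376",
    "Apr 18 07:30:02 jpus001-pgh1 kernel: md: disk0 read error, sector=27516384",
]

for disk, errors in sorted(count_md_read_errors(sample).items()):
    print(f"disk{disk}: {errors} read errors")
```

Any non-zero count for a disk is worth investigating before trusting a parity check or rebuild.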

April 18, 2018

Author

I did see that in the logs but didn't think much of it since I've been seeing a lot of that lately. Thanks for the suggestion, I'll give it a shot.

April 19, 2018

8 hours ago, Distinguished said:

22 trillion read and write errors

3 hours ago, Distinguished said:

I've been seeing a lot of that lately

The two are the same thing. (Actually, it's "only" 40 million errors but even a single one is unacceptable.) Disk0 refers to the (first) Parity disk, BTW. Do you have notifications enabled (Settings -> Notification Settings)?

A couple of other pointers: your server was building Parity2 (not checking parity) and rebuilding Disk2. That's fine, it can do both simultaneously, but in order to do so it needs to be able to read all the other disks, so when Parity1 (just labelled Parity) fell offline both processes failed. The building of Parity2 won't stop when it reaches the 2 TB mark but will continue (filling the remainder of the disk with zeros) all the way to 3 TB. The other disks will no longer be involved in the calculation at that point and, depending on your settings, will probably spin down. I suggest you uninstall the Preclear plugin, at least until you've fixed this problem, because it spams the syslog and makes it difficult to read.
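To put numbers on the 2 TB / 3 TB point, here is a rough sketch of the arithmetic. The function name `parity_build_phases` and the number of data disks are assumptions for illustration only; the sizes are the 2 TB data disks and 3 TB parity disk described in this thread.

```python
TB = 10**12  # decimal terabyte, as printed on drive labels

def parity_build_phases(data_disk_sizes, parity_size):
    """Return (bytes of parity computed from data, bytes filled with zeros)."""
    largest_data = max(data_disk_sizes)
    if parity_size < largest_data:
        raise ValueError("parity disk must be at least as large as the largest data disk")
    return largest_data, parity_size - largest_data

# Hypothetical array: three 2 TB data disks and a 3 TB parity disk.
computed, zero_filled = parity_build_phases([2 * TB, 2 * TB, 2 * TB], 3 * TB)
print(computed / TB, zero_filled / TB)  # 2.0 1.0

# A 2 TB disk expressed in binary units (TiB), which is roughly the
# 1.81-1.82 figure reported for the data disks.
print(round(2 * TB / 2**40, 2))  # 1.82
```

So the build running past the size of the data disks is expected, and the data disks spinning down during the last third is not itself a sign of trouble.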

Edited April 19, 2018 by John_M
typo

April 19, 2018

Not to hijack the thread, but I replaced a failing drive (1.5TB or so) with a new 8TB drive. It is rebuilding the new drive, and in the process there were read errors on the first disk (disk 1, 8TB) and on a different disk (disk 5, 1TB, 5+ years old). I'm not sure if these read errors were corrected?

Edited April 19, 2018 by daze

April 19, 2018

Community Expert

1 hour ago, daze said:

Not to hijack the thread, but I replaced a failing drive (1.5TB or so) with a new 8TB drive. It is rebuilding the new drive, and in the process there were read errors on the first disk (disk 1, 8TB) and on a different disk (disk 5, 1TB, 5+ years old). I'm not sure if these read errors were corrected?

Start your own thread and post your diagnostics.

April 20, 2018

Author

I swapped the power and SATA cables, moved the Parity 1 drive to a different SAS port, started a new config, and was able to build full parity without issue. Everything appears to be running fine. Thanks everyone!
