Parity errors again (this time with syslog)

January 25, 2010

Same situation, transferring files to the server.

Parity syncs complete with 0 errors every time; however, when I'm actually transferring data to the server it starts to produce errors over time.

I see lots of errors and have no idea what they mean. There are so many that I have a hard time believing this is just one failing device or a simple bad SATA cable.

syslog-2010-01-25.zip

January 25, 2010

Author

I was hoping to wake up to some replies.

I paid $1500 + unRAID Pro for this setup, and it's been nonstop headaches. 3 DOA drives, 1 mobo DOA, 3 delayed packages due to weather. I've never had this many issues. I have tons of data that I needed to have put on this server a month ago.

Could I please get some help on the syslog file?

January 25, 2010

Give us some time. You posted at about 2 AM my time, and you have to take that into account. I am currently at work, and while I do browse the forum, I don't have the time to look through a syslog right now.

Did you build the server yourself, or buy it from Limetech? If it came from Limetech, then it was bad luck on the drives and motherboard!

Basically it comes down to stuff just hitting the fan. I have been going through the same thing with crap drives from Seagate for the last three months. I finally got them to send me a new drive (instead of a refurb) and am going to preclear it tonight when I get home.

If I get a chance I will take a look through the syslog. In the meantime, check that the SATA cables are connected well (preferably use locking cables), and check the power connection for each drive.

SMART reports on the drives might also help to determine if anything is wrong with them. There are instructions in the unRAID wiki.

January 25, 2010

Author

I built this myself, but I've built over 50 computers in my lifetime, and I ordered the parts from Newegg and Amazon. The SATA and power cables are making good contact with the drive. I went ahead and replaced the SATA cable with a better-quality one. There have been no errors yet; however, this error has happened twice, and only after transferring massive amounts of data to the server (I haven't transferred any since the change). The log shows tons of read and write errors, which makes me think it's the hard drive, although I tested it thoroughly before putting it into the server.

Here is my smart report for the parity drive, everything looks fine to me.

smart.txt

January 25, 2010

Here is my smart report for the parity drive, everything looks fine to me.

Just before your last edit you said that that's the smart report for your sda. That's not the drive giving you the problem.

The problem is with sdf, your parity drive:

Jan 23 07:36:07 Server kernel: md: import disk0: [8,80] (sdf) WDC WD15EADS-00P8B0 WD-WCAVU0381827 offset: 63 size: 1465138552
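For anyone matching array slots to device names in their own syslog, the import line above can be picked apart mechanically. This is only a sketch: the function name `parse_md_import` is made up for illustration, and it assumes the serial number is the last token before `offset:` (true of the line quoted here, not guaranteed for every drive).

```python
import re

# Matches the "md: import diskN" line format quoted in this thread.
IMPORT_RE = re.compile(
    r"md: import disk(?P<slot>\d+): \[(?P<major>\d+),(?P<minor>\d+)\] "
    r"\((?P<dev>\w+)\) (?P<ident>.+?) offset: (?P<offset>\d+) size: (?P<size>\d+)"
)

def parse_md_import(line):
    """Return slot/device details from one import line, or None if no match."""
    m = IMPORT_RE.search(line)
    if m is None:
        return None
    # Assumption: the identity string is "<model> <serial>", serial last.
    model, _, serial = m.group("ident").rpartition(" ")
    return {
        "slot": int(m.group("slot")),  # in this thread, disk0 is the parity slot
        "device": "/dev/" + m.group("dev"),
        "model": model,
        "serial": serial,
    }

line = ("Jan 23 07:36:07 Server kernel: md: import disk0: [8,80] (sdf) "
        "WDC WD15EADS-00P8B0 WD-WCAVU0381827 offset: 63 size: 1465138552")
info = parse_md_import(line)
print(info)
```

Comparing the serial it reports against the label on the physical drive is a quick way to confirm which disk a device name refers to.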

it's the hard drive, although I tested it thoroughly before putting it into the server.

What tool did you use to test it?

January 25, 2010

Author

Sorry, I removed my edit because I figured out how to choose which drive to obtain a SMART report from. The log attached above is for the parity drive (sdf); before the edit it was my disk1 SMART report. I also verified the serial numbers to make sure the log above is in fact from the parity drive.

I used HD Tune Pro (in windows) to test the drive. I wrote zeros to the entire drive, and then ran error checking. In the past this has always found any errors on my hard drives.

January 25, 2010

I have had drives get past the tests just fine but still end up dying. The Seagates I have gotten have passed 2-3 rounds of preclear and then crapped out after a month.

We suggest using the great script created by Joe L. for preclearing the drives. You should be able to find it in the Customization forum or in the unRAID Add-on section.

January 25, 2010

Author

I've downloaded the preclear script and added it to my flash drive; I'll use it from now on when adding new drives. Would I get any benefit from transferring all my data off the server and preclearing my existing 6 drives? It's about 4TB, which would take a very long time.

January 25, 2010

If it ain't broke, don't fix it.

January 25, 2010

Author

So the fact that I didn't preclear the drives couldn't be causing the parity errors and the problems in my syslog? I'm about to transfer another 250GB or so to my server, with the new SATA cable attached to the parity drive. The syslog claims the drive had read and write errors; however, if that were true they would show up in SMART (or at least I'm pretty sure that's how SMART works). So that leaves the SATA cable, a bad mobo SATA controller, or a bad power cable. I'm using a high-quality Corsair 850W PSU, so I doubt it's a power cable issue. My mobo is the Supermicro MBD-X7SBE, which passed level 1 testing, and many users are using it.

Could it possibly be a setting wrong in the mobo BIOS?

January 25, 2010

Nope, the preclear will do nothing for you now that the drives are in the array and the fact that you tested/tortured them with another tool makes no difference.

I doubt it is something in the BIOS and I would lean more towards the bad SATA cable or the drive starting to give up the ghost. Running a long and/or short smart test on the drive might help to find any problems with it.

Take a smart reading (or use the one above), then run the short test, grab a report and compare, run the long test (will take a while and make sure to disable disk spin down), and grab another report to compare.

It is a pain in the arse process but it might be your only option
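The before/after comparison can be less painful with a small script. The sketch below assumes the reports are saved as text and contain the usual smartctl attribute table (ID#, ATTRIBUTE_NAME, FLAG, ..., RAW_VALUE); the function names are hypothetical and the sample values are invented for illustration, not taken from the attached reports.

```python
def parse_attributes(report_text):
    """Map attribute name -> raw value from a saved smartctl report."""
    attrs = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and carry a hex FLAG column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            attrs[fields[1]] = " ".join(fields[9:])
    return attrs

def changed_attributes(before_text, after_text):
    """Return {name: (raw_before, raw_after)} for raw values that differ."""
    before = parse_attributes(before_text)
    after = parse_attributes(after_text)
    return {
        name: (before.get(name), raw)
        for name, raw in after.items()
        if before.get(name) != raw
    }

# Invented sample rows, only to show the shape of the input.
BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   120   110   000    Old_age   Always       -       30
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
"""
AFTER = BEFORE.replace("-       30", "-       32")

diff = changed_attributes(BEFORE, AFTER)
for name, (old, new) in diff.items():
    print(f"{name}: {old} -> {new}")
```

A temperature change after a long test is expected; growth in reallocated or pending sectors is the kind of change worth reporting back.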

January 25, 2010

Author

What SATA cables would you recommend buying? My SATA cards have 8 ports on them, so SATA cables with locking mechanisms tend to be too large to fit. I wanted to build the server with high-quality SATA cables, but my friend insisted "the red cables are fine".

Also, may I ask why you suggest disabling disk spin-down when running SMART tests? I actually just told my friend, "It almost seems like these errors are caused by the disk trying to spin down while I'm transferring data to it". I was under the impression that the disk needed to be completely inactive for X amount of time (I have it set to 15 minutes) before it tries to spin down.

January 25, 2010

So the fact that I didn't preclear the drives couldn't be causing parity errors and the problems in my syslog?

It would have nothing to do with it. Preclearing would weed out early-failing drives and reallocate sectors that cannot be written to, but skipping it would not cause "parity" errors.

I'm about to transfer another 250GB or so to my server, with the new SATA cable attached to the parity drive. The syslog claims the drive had write and read errors, however if that was true they would show in SMART (or at least i'm pretty sure thats how SMART works).

Only if you ran the SMART report on the correct drive, and only if you know how to interpret the report. Most people do not; I don't know how to interpret "read" or "write" errors in the report. About all that most of us know how to interpret are reallocated sectors, sectors pending reallocation, and temperature. Other than that, unless a counter is below the manufacturer's threshold, a drive will "report" as passed. (And we've seen many drives that "passed" but were bad.)

So that leaves, SATA cable, bad mobo SATA controller, or bad power cable. I'm using a Corsair 850W high quality PSU so I doubt it's a power cable issue.

It does not eliminate the possibility; it just makes it less likely. If you've used any power splitters, then every metal-to-metal connection introduces a possible poor contact.

My mobo is the Supermicro MBD-X7SBE which passed level 1 testing, and many users are using it.
Could it possibly be a setting wrong in the mobo BIOS?

Unlikely to be a setting, but it could easily be ANY hardware. Many people change SATA cables... but I personally have read of more power cable problems on these forums. Poor-quality cables usually show up as CRC errors, not disk freezes. A freeze, to me, points to a loose cable, an intermittent connection at either the drive or the disk controller, or a power cable. If your case has backplanes that the disks plug into, those are also suspect.

The first error I saw in your syslog was here:

Jan 24 11:43:37 Server kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 24 11:43:37 Server kernel: ata6.00: cmd 25/00:00:1f:64:33/00:04:5e:00:00/e0 tag 0 dma 524288 in
Jan 24 11:43:37 Server kernel:          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 24 11:43:37 Server kernel: ata6.00: status: { DRDY }
Jan 24 11:43:37 Server kernel: ata6: hard resetting link
Jan 24 11:43:47 Server kernel: ata6: softreset failed (device not ready)
Jan 24 11:43:47 Server kernel: ata6: hard resetting link
Jan 24 11:43:57 Server kernel: ata6: softreset failed (device not ready)
Jan 24 11:43:57 Server kernel: ata6: hard resetting link
Jan 24 11:44:07 Server kernel: ata6: link is slow to respond, please be patient (ready=0)
Jan 24 11:44:32 Server kernel: ata6: softreset failed (device not ready)
Jan 24 11:44:32 Server kernel: ata6: limiting SATA link speed to 1.5 Gbps
Jan 24 11:44:32 Server kernel: ata6: hard resetting link
Jan 24 11:44:37 Server kernel: ata6: softreset failed (device not ready)
Jan 24 11:44:37 Server kernel: ata6: reset failed, giving up
Jan 24 11:44:37 Server kernel: ata6.00: disabled
Jan 24 11:44:37 Server kernel: ata6.00: device reported invalid CHS sector 0
Jan 24 11:44:37 Server kernel: ata6: EH complete
Jan 24 11:44:37 Server kernel: sd 6:0:0:0: [sdf] Unhandled error code
Jan 24 11:44:37 Server kernel: sd 6:0:0:0: [sdf] Result: hostbyte=0x04 driverbyte=0x00
Jan 24 11:44:37 Server kernel: end_request: I/O error, dev sdf, sector 1580426271
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426208/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426216/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426224/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426232/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426240/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426248/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426256/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error
Jan 24 11:44:37 Server kernel: handle_stripe read error: 1580426264/0, count: 1
Jan 24 11:44:37 Server kernel: md: disk0 read error

Attempts to send commands to the disk fail. The kernel tries to reset the link and cannot, so it disables the device. All subsequent "reads" and "writes" fail.
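To see that pattern at a glance in a long syslog, the lines can be tallied per ATA port and per device. This is a rough sketch, not an unRAID tool: the `summarize` function is hypothetical, and it only recognizes the three message shapes that appear in the excerpt above.

```python
import re
from collections import Counter

ATA_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")
IO_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")

def summarize(lines):
    """Count link resets per port, disabled ports, and I/O errors per device."""
    resets, disabled, io_errors = Counter(), set(), Counter()
    for line in lines:
        m = ATA_RE.search(line)
        if m:
            port, msg = m.group(1), m.group(2).strip()
            if "hard resetting link" in msg:
                resets[port] += 1
            elif msg == "disabled":
                disabled.add(port)
        m = IO_RE.search(line)
        if m:
            io_errors[m.group(1)] += 1
    return resets, disabled, io_errors

# A few lines copied from the excerpt above.
sample = [
    "Jan 24 11:43:37 Server kernel: ata6: hard resetting link",
    "Jan 24 11:43:47 Server kernel: ata6: softreset failed (device not ready)",
    "Jan 24 11:43:47 Server kernel: ata6: hard resetting link",
    "Jan 24 11:44:37 Server kernel: ata6.00: disabled",
    "Jan 24 11:44:37 Server kernel: end_request: I/O error, dev sdf, sector 1580426271",
]
resets, disabled, io_errors = summarize(sample)
print(dict(resets), disabled, dict(io_errors))
```

Run over a whole syslog (for example `summarize(open("syslog"))`), a single port collecting all the resets points at one drive, cable, or controller port rather than a system-wide fault.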

It is your parity disk. (currently) /dev/sdf

Please run:

smartctl -a -d ata /dev/sdf

and report back with its output.

Then, turn off disk spin-down and run

smartctl -d ata -t long /dev/sdf

Let it run for about 5 hours (your prior SMART report says the expected time to complete an "extended offline test" is 255 minutes). After that interval, again type

smartctl -a -d ata /dev/sdf

and report back with its output.

After those tests you can switch the parity disk to a different port on the disk controller. If you do, you'll need to press "restore" to save a new disk configuration. (It is actually a "set disk configuration" button) If the errors move to the other disk, then the controller is at fault.

Remember to power down for ANY changes in cabling, or any changes to disk drives in server slots. unRAID is NOT hot-pluggable.

Run a memory test for at least several cycles... and at least double check the memory voltage, timings, and clock speed. It is not likely to be the issue, but it would certainly complicate your trouble-shooting efforts if it was contributing errors too.

Lastly, in your frustration in your first post you said you felt it was unlikely that a single drive could cause so many errors. Well, yes, it can: each sector it attempts to write fails, each failure results in several lines in the system log, and the process of calculating parity is going to write every one of the 1,500,301,910,016 bytes of the drive. That is a LOT of 512-byte sectors, and potentially a LOT of errors.
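To put a number on "a LOT", here is the arithmetic as a short sketch. The stripe size and lines-per-failure figures are assumptions read off the excerpt above (the handle_stripe addresses step by 8 sectors, and each failed stripe shows two log lines); the result is a worst-case ceiling, not a prediction.

```python
DRIVE_BYTES = 1_500_301_910_016  # capacity quoted in the post above
SECTOR_BYTES = 512

sectors = DRIVE_BYTES // SECTOR_BYTES

# Assumptions taken from the syslog excerpt, not from unRAID documentation.
SECTORS_PER_STRIPE = 8
LINES_PER_STRIPE = 2

stripes = sectors // SECTORS_PER_STRIPE
worst_case_lines = stripes * LINES_PER_STRIPE

print(f"{sectors:,} sectors")
print(f"{stripes:,} stripes, up to {worst_case_lines:,} log lines")
```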

Joe L.

January 25, 2010

Author

I'll run the SMART tests and report back.

I ran Memtest for 12 hours and it had 0 errors, I also ran LinX (popular overclocking stability testing program) for 6 hours to stress test the system as a whole before using unRAID.

January 25, 2010

Also, may I ask why you suggest disabling disk spin down when running SMART tests? I actually just told my friend "It almost seems like these errors are caused by the disk trying to spin down while im transfering data to it". I was under the impression that the disk needed to be completely inactive for X amount of time (I have it set for 15 minutes) before it tries to spin down.

Easy. It is unRAID, monitoring activity through the "md" driver, that spins a drive down when it has been inactive for the set period.

When running a long SMART test, which runs entirely inside the disk drive, there is NO I/O on the SATA cables and no I/O through the "md" devices, so unRAID will think the disks have been idle and spin them down after the time-out period you configured.

The command to spin down a drive will abort a "long" test. Other unRAID users discovered that to be true on their disks... (It is really hard to continue a 5-hour test reading the entire disk when the disk is told to spin down.)

Now, you might have disks with firmware that continues the "long" test and ignores spin-down commands, but do you want to waste a few hours finding out whether they do? Or would you possibly just accept the advice as being reasonably sound, and not complicate your life right now with a test that might not run to completion because you instructed it to abort 15 minutes into a 5-hour test?

Joe L.

January 26, 2010

Author

Here are my before and after SMART logs.

# 1 Extended offline Completed without error 00% 142 -

Ignore the 3 aborted tests; that was me. I managed to transfer around 500GB with no parity errors after swapping the SATA cable, so I'm hoping that was it.

beforetest.txt

aftertest.txt

January 26, 2010

I hope it was it too... That drive looks healthy to me.

January 26, 2010

Author

Now I need to replace all my SATA cables with quality ones; however, I can't really find any that fit on this.

The SATA ports are really close together. The top one leaves almost no room for a quality cable, and all the quality cables I've seen have clips that require more room.

January 26, 2010

I'm not sure if your existing cables need replacing. In your first post in this thread you said the parity checks always have 0 errors. Did you see CRC errors in the syslog when doing a parity check that leads you to suspect the cables? Or just the disk freeze as in your attached syslog from Jan 23rd.

The only other issue I can think of is an interrupt conflict between your disk controller and your network interface. That might result in errors when transferring data that do not occur when doing a parity check.

Can you post the output of

cat /proc/interrupts
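A rough way to read that output is to look for IRQ lines that list more than one device. The sketch below is an assumption-laden helper, not part of unRAID: `shared_irqs` is a made-up name, the sample text and its device names are invented, and the exact column layout of /proc/interrupts varies between kernel versions. A shared line is normal on many boards and is not by itself proof of a conflict.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs listing several devices."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if not label.strip().isdigit():
            continue  # skip NMI, LOC, ERR and similar summary rows
        fields = rest.split(None, ncpu)  # per-CPU counters, then the remainder
        if len(fields) <= ncpu:
            continue
        pieces = fields[ncpu].split(",")
        # The first piece still holds the controller type; keep its last token.
        devices = [pieces[0].split()[-1]] + [p.strip() for p in pieces[1:]]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared

# Invented sample, only to show the expected shape of the input.
SAMPLE = """\
           CPU0       CPU1
  0:        125          0   IO-APIC-edge      timer
 16:     104512       2210   IO-APIC-fasteoi   sata_mv, eth0
 19:      88231          0   IO-APIC-fasteoi   ahci
NMI:          0          0   Non-maskable interrupts
"""
print(shared_irqs(SAMPLE))
```

If the disk controller and the network interface turn up on the same line, that would fit errors that appear only during network transfers.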

Joe L.

January 26, 2010

Author

The only thing I've seen is the disk errors, but the disk seems fine. No CRC errors. Here's that log...

interrupts.txt
