
Parity Disk Errors - Please help


bkastner


Hi

 

I've posted a few times about my WD 6TB Red parity disk, which shows errors whenever a parity check is run. It was suggested that I run in Xen mode, as this seemed to have better success, so I've been running that way for the last couple of weeks. I thought all was good, but have found out otherwise.

 

I am getting errors in the GUI on my parity drive, and if I look through my syslog I can see a bunch of errors again.

 

I am going on vacation with my family Saturday morning and will not have access to my system at all (I am going to be on a cruise so don't want to be troubleshooting from the middle of the ocean). :)

 

Can someone please review my logs and advise whether there is anything I should be doing (short of shutting off the system until I get back)? Everything seems to be running smoothly (outside of the errors), but I don't want an issue arising while I am travelling.

 

Any comments/suggestions would be appreciated.

syslog.zip

Link to comment

Looks to be bad spots on the disk or some kind of driver incompatibility.

 

Turn off spin-down.

Stop the array.

Capture (save) a SMART report with smartctl -a /dev/sdX (the parity disk shows up as sdd in this syslog).

Post it for review.

Trigger a SMART long test with smartctl -t long /dev/sdX.

smartctl -t long will come back with an ETA for when the test will have completed. Wait the specified number of minutes.

It will feel like an eternity; enjoy life.

Capture (save) another SMART report with smartctl -a.

Post it again for review.

Keep in mind that it may take a good amount of power to spin up the 6TB drives, so it is possible the PSU is being taxed differently now too.
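The report/test/report sequence above could be scripted roughly as follows. This is only a sketch: the helper names (smart_report_cmd, long_test_cmd, save_report) and the output file name are made up for illustration, /dev/sdX is a placeholder for the real device, and it assumes smartmontools is installed and the script runs as root.

```python
import subprocess


def smart_report_cmd(device):
    """Argument list for a full SMART report (smartctl -a)."""
    return ["smartctl", "-a", device]


def long_test_cmd(device):
    """Argument list that starts an extended (long) self-test."""
    return ["smartctl", "-t", "long", device]


def save_report(device, path):
    """Run smartctl -a and save its output to `path`.

    smartctl uses a bitmask exit status, so a non-zero return code is
    returned to the caller instead of being treated as a failure here.
    """
    result = subprocess.run(smart_report_cmd(device),
                            capture_output=True, text=True)
    with open(path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode


# Hypothetical usage (device and file name are placeholders):
#   save_report("/dev/sdX", "smart_before.txt")
#   subprocess.run(long_test_cmd("/dev/sdX"))
```

Nothing runs until the functions are called, so the device name can be double-checked first.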

Nov  7 03:40:01 CydStorage logger: ./TV/White Collar/Season 3/White Collar S03E05 - Veiled Threat.mkv
Nov  7 03:40:01 CydStorage logger: .d..t...... ./
Nov  7 03:40:02 CydStorage logger: .d..t...... TV/
Nov  7 03:40:02 CydStorage logger: .d..t...... TV/White Collar/
Nov  7 03:40:02 CydStorage logger: cd+++++++++ TV/White Collar/Season 3/
Nov  7 03:40:02 CydStorage logger: >f+++++++++ TV/White Collar/Season 3/White Collar S03E05 - Veiled Threat.mkv
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_sas.c 1963:Release slot [0] tag[0], task [ffff88010e7ae780]:
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active FFFFFFFD,  slot [0].
Nov  7 03:40:15 CydStorage kernel: sas: sas_ata_task_done: SAS error 8a
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_sas.c 1963:Release slot [2] tag[2], task [ffff8803ebb58500]:
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active FFFFFFFC,  slot [2].
Nov  7 03:40:15 CydStorage kernel: sas: sas_ata_task_done: SAS error 8a
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_sas.c 1963:Release slot [3] tag[3], task [ffff8803ebb59b80]:
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active FFFFFFF8,  slot [3].
Nov  7 03:40:15 CydStorage kernel: sas: sas_ata_task_done: SAS error 8a
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_sas.c 1963:Release slot [4] tag[4], task [ffff8803ebb59540]:
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active FFFFFFF0,  slot [4].
Nov  7 03:40:15 CydStorage kernel: sas: sas_ata_task_done: SAS error 8a
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_sas.c 1963:Release slot [5] tag[5], task [ffff8803ebb58b40]:
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active FFFFFFE0,  slot [5].
Nov  7 03:40:15 CydStorage kernel: sas: sas_ata_task_done: SAS error 8a
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_sas.c 1963:Release slot [6] tag[6], task [ffff8803ebb59e00]:
Nov  7 03:40:15 CydStorage kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active FFFFFFC0,  slot [6].
Nov  7 03:40:15 CydStorage kernel: sas: sas_ata_task_done: SAS error 8a
...
Nov  7 03:40:16 CydStorage kernel: ata8.00: failed command: READ FPDMA QUEUED
Nov  7 03:40:16 CydStorage kernel: ata8.00: cmd 60/00:00:e8:e6:3c/04:00:2a:01:00/40 tag 30 ncq 524288 in
Nov  7 03:40:16 CydStorage kernel:         res 01/04:00:00:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation)
Nov  7 03:40:16 CydStorage kernel: ata8.00: status: { ERR }
Nov  7 03:40:16 CydStorage kernel: ata8.00: error: { ABRT }
Nov  7 03:40:16 CydStorage kernel: ata8: hard resetting link
Nov  7 03:40:16 CydStorage kernel: ata8.00: configured for UDMA/133
Nov  7 03:40:16 CydStorage kernel: sd 1:0:1:0: [sdd] Unhandled sense code
Nov  7 03:40:16 CydStorage kernel: sd 1:0:1:0: [sdd]  
Nov  7 03:40:16 CydStorage kernel: Result: hostbyte=0x00 driverbyte=0x08
Nov  7 03:40:16 CydStorage kernel: sd 1:0:1:0: [sdd]  
Nov  7 03:40:16 CydStorage kernel: Sense Key : 0x3 [current] [descriptor]
Nov  7 03:40:16 CydStorage kernel: Descriptor sense data with sense descriptors (in hex):
Nov  7 03:40:16 CydStorage kernel:        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01 
Nov  7 03:40:16 CydStorage kernel:        2a 3c 7e e0 
Nov  7 03:40:16 CydStorage kernel: sd 1:0:1:0: [sdd]  
Nov  7 03:40:16 CydStorage kernel: ASC=0x11 ASCQ=0x4
Nov  7 03:40:16 CydStorage kernel: sd 1:0:1:0: [sdd] CDB: 
Nov  7 03:40:16 CydStorage kernel: cdb[0]=0x88: 88 00 00 00 00 01 2a 3c 7e e0 00 00 04 00 00 00
Nov  7 03:40:16 CydStorage kernel: end_request: I/O error, dev sdd, sector 5003575008
Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574944
Nov  7 03:40:16 CydStorage kernel: ata8: EH complete
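For anyone wanting to cross-check the numbers in that excerpt, the CDB line can be decoded by hand. A minimal sketch, assuming the standard 16-byte READ(16) layout (opcode 0x88, 8-byte big-endian LBA, 4-byte transfer length) and 512-byte logical sectors; decode_read16 is a made-up helper name.

```python
def decode_read16(cdb_hex):
    """Return (lba, block_count) from a READ(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: starting LBA
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


lba, blocks = decode_read16("88 00 00 00 00 01 2a 3c 7e e0 00 00 04 00 00 00")
print(lba, blocks, blocks * 512)  # 5003575008 1024 524288
```

The LBA matches the "I/O error, dev sdd, sector 5003575008" line, and 1024 sectors of 512 bytes matches the "ncq 524288 in" size of the failed command. The md layer reports sector 5003574944, which is 64 lower; that is presumably the partition start offset, but that part is a guess.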

Line 1230: Nov  5 11:29:09 CydStorage kernel: md: unRAID driver 2.3.1 installed
   Line 1235: Nov  5 11:29:09 CydStorage kernel: md: import disk0: [8,48] (sdd) WDC_WD60EFRX-68MYMN1_WD-WX51H3421101 size: 5860522532
   Line 1237: Nov  5 11:29:09 CydStorage kernel: md: import disk1: [8,112] (sdh) WDC_WD30EZRX-00DC0B0_WD-WMC1T1733337 size: 2930266532
   Line 1239: Nov  5 11:29:09 CydStorage kernel: md: import disk2: [8,224] (sdo) WDC_WD30EZRX-00DC0B0_WD-WMC1T0093370 size: 2930266532
   Line 1241: Nov  5 11:29:09 CydStorage kernel: md: import disk3: [8,208] (sdn) WDC_WD30EZRX-00DC0B0_WD-WMC1T0383687 size: 2930266532
   Line 1243: Nov  5 11:29:09 CydStorage kernel: md: import disk4: [8,192] (sdm) WDC_WD30EZRX-00MMMB0_WD-WCAWZ2445238 size: 2930266532
   Line 1245: Nov  5 11:29:09 CydStorage kernel: md: import disk5: [8,240] (sdp) WDC_WD30EZRX-00DC0B0_WD-WMC1T3028282 size: 2930266532
   Line 1247: Nov  5 11:29:09 CydStorage kernel: md: import disk6: [8,96] (sdg) WDC_WD30EZRX-00MMMB0_WD-WCAWZ2385454 size: 2930266532
   Line 1249: Nov  5 11:29:09 CydStorage kernel: md: import disk7: [8,80] (sdf) WDC_WD30EZRX-00DC0B0_WD-WMC1T1356713 size: 2930266532
   Line 1251: Nov  5 11:29:09 CydStorage kernel: md: import disk8: [8,160] (sdk) WDC_WD40EZRX-00SPEB0_WD-WCC4E0503635 size: 3906985768
   Line 1253: Nov  5 11:29:09 CydStorage kernel: md: import disk9: [8,176] (sdl) WDC_WD40EZRX-00SPEB0_WD-WCC4E0893895 size: 3907018532
   Line 1255: Nov  5 11:29:09 CydStorage kernel: md: import disk10: [8,144] (sdj) WDC_WD40EZRX-00SPEB0_WD-WCC4E0425712 size: 3906985768
   Line 1257: Nov  5 11:29:09 CydStorage kernel: md: import disk11: [8,32] (sdc) WDC_WD40EFRX-68WT0N0_WD-WCC4ELZ726FT size: 3907018532
   Line 1259: Nov  5 11:29:09 CydStorage kernel: md: import disk12: [8,128] (sdi) WDC_WD40EFRX-68WT0N0_WD-WCC4E1471407 size: 3907018532
   Line 1607: Nov  5 11:31:33 CydStorage kernel: md: recovery thread woken up ...
   Line 1608: Nov  5 11:31:33 CydStorage kernel: md: recovery thread checking parity...
   Line 1609: Nov  5 11:31:33 CydStorage kernel: md: using 9344k window, over a total of 5860522532 blocks.
   Line 1616: Nov  5 14:12:28 CydStorage kernel: md: correcting parity, sector=2240016032
   Line 2229: Nov  6 05:11:35 CydStorage kernel: md: sync done. time=63602sec
   Line 2230: Nov  6 05:11:35 CydStorage kernel: md: recovery thread sync completion status: 0

   Line 2748: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574944
   Line 2750: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574952
   Line 2751: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574960
   Line 2752: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574968
...tons of sector errors deleted...
   Line 2873: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003575928
   Line 2874: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003575936
   Line 2875: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003575944
   Line 2876: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003575952
   Line 2877: Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003575960

   Line 5543: Nov 11 12:38:35 CydStorage kernel: md: disk0 read error, sector=200306560
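A list like the one above can be pulled out of a syslog and collapsed into sector ranges with a few lines of Python. This is a rough sketch, not an unRAID tool: summarize_read_errors is a hypothetical helper, and the stride of 8 sectors is simply what the lines above show.

```python
import re

MD_ERR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def summarize_read_errors(lines, stride=8):
    """Group md read-error sectors per disk into (first, last) runs.

    Sectors that follow each other at exactly `stride` apart are
    merged into one run; anything else starts a new run.
    """
    per_disk = {}
    for line in lines:
        m = MD_ERR.search(line)
        if m:
            per_disk.setdefault(m.group(1), set()).add(int(m.group(2)))

    runs = {}
    for disk, sectors in per_disk.items():
        ordered = sorted(sectors)
        start = prev = ordered[0]
        out = []
        for s in ordered[1:]:
            if s - prev != stride:
                out.append((start, prev))
                start = s
            prev = s
        out.append((start, prev))
        runs[disk] = out
    return runs


sample = [
    "Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574944",
    "Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574952",
    "Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574960",
    "Nov  7 03:40:16 CydStorage kernel: md: disk0 read error, sector=5003574968",
    "Nov 11 12:38:35 CydStorage kernel: md: disk0 read error, sector=200306560",
]
print(summarize_read_errors(sample))
```

Run against the full syslog, this should show whether the errors cluster in one or two regions or are scattered across the disk.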

Link to comment


Thanks Weebo.

 

I have a Corsair AX860 PSU, so I would hope I am not taxing that. I went overboard on the PSU to ensure it was not the source of issues (as I've had that in the past).

 

I will try the smart tests and post.

Link to comment

Here is the smartctl -a output

 

I am running the long test overnight and will post results in the AM:

 

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.16.3-unRAID] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===

Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".

Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.

Testing has begun.

Please wait 727 minutes for test to complete.

Test will complete after Fri Nov 14 11:20:16 2014

 

Use smartctl -X to abort test.

 

Since it drops me back to the prompt, do I get a notification when it finishes? Once it's done, do I just run smartctl -a again and re-post the results?
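As far as I know, smartctl itself does not send any notification: the self-test runs inside the drive, and the result only shows up the next time the self-test log is read (smartctl -a, or smartctl -l selftest). A small sketch for working out when to check back, using the output pasted above; parse_selftest_eta is a made-up helper name, and the date parsing assumes an English locale.

```python
import re
from datetime import datetime, timedelta


def parse_selftest_eta(output):
    """Return (minutes, completes_at, started_at) from smartctl -t output."""
    minutes = int(re.search(r"Please wait (\d+) minutes", output).group(1))
    stamp = re.search(r"Test will complete after (.+)", output).group(1).strip()
    completes_at = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    started_at = completes_at - timedelta(minutes=minutes)
    return minutes, completes_at, started_at


text = """Please wait 727 minutes for test to complete.
Test will complete after Fri Nov 14 11:20:16 2014
"""
minutes, completes_at, started_at = parse_selftest_eta(text)
print(minutes, completes_at, started_at)
```

Once the completion time has passed, running smartctl -a again should list the finished test in the self-test log.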

 

smartctl_a.txt

Link to comment

The drive itself is fine.

There may be a firmware or driver incompatibility or some kind of weak cable connection.

 

See also...

http://lime-technology.com/forum/index.php?topic=36065.msg335979#msg335979

 

I am running a Norco 4224, so I don't have direct cable attachments. The drive is seated firmly in its bay and the other drives on that backplane have no issues, so I am guessing that is not the problem (same as with my overkill power supply).

 

Thanks for taking the time to review though... I appreciate it.

 

Link to comment


Try a different slot.

Link to comment


I guess that couldn't hurt. :)

 

Good idea.

Link to comment

Archived

This topic is now archived and is closed to further replies.
