new AOC-SAS2LP-MV8 card installed - keep getting errors - driver error?


So I recently installed an AOC-SAS2LP-MV8. (I already have an AOC-SASLP-MV8, without the 2, installed that works fine.) I have disabled INT13 on both, all drives are detected at boot, and everything looks fine. However, every time I run a parity check, after about an hour one of the drives connected to the new card suddenly starts throwing errors.

 

It seems to start with a 'SAS error 8a'.

All is fine until the below happens, and then the drives start failing one by one. SMART tests of the drives come back fine, and the drives work fine if I swap them onto the other card, so I'm about 99% sure it's the card, or something to do with the card.

 

What can I do here?

Mar  5 09:52:18 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [1] tag[1], task [ffff880015ea1680]:
Mar  5 09:52:18 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:19 jbox kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Mar  5 09:52:19 jbox kernel: sas: ata14: end_device-1:7: cmd error handler
Mar  5 09:52:19 jbox kernel: sas: ata7: end_device-1:0: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata8: end_device-1:1: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata9: end_device-1:2: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata10: end_device-1:3: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata11: end_device-1:4: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata12: end_device-1:5: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata13: end_device-1:6: dev error handler
Mar  5 09:52:19 jbox kernel: sas: ata14: end_device-1:7: dev error handler
Mar  5 09:52:19 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:19 jbox kernel: ata14: failed to read log page 10h (errno=-5)
Mar  5 09:52:19 jbox kernel: ata14.00: exception Emask 0x1 SAct 0x8000 SErr 0x0 action 0x6
Mar  5 09:52:19 jbox kernel: ata14.00: failed command: READ FPDMA QUEUED
Mar  5 09:52:19 jbox kernel: ata14.00: cmd 60/08:00:a0:94:4b/00:00:30:01:00/40 tag 15 ncq 4096 in
Mar  5 09:52:19 jbox kernel:         res 01/04:74:70:a5:46/00:00:30:01:00/40 Emask 0x3 (HSM violation)
Mar  5 09:52:19 jbox kernel: ata14.00: status: { ERR }
Mar  5 09:52:19 jbox kernel: ata14.00: error: { ABRT }
Mar  5 09:52:19 jbox kernel: ata14: hard resetting link
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000000F4,  slot [3].
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000000F0,  slot [0].
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000090,  slot [1].
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000010,  slot [2].
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000010,  slot [5].
Mar  5 09:52:19 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:19 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:19 jbox kernel: ata14.00: both IDENTIFYs aborted, assuming NODEV
Mar  5 09:52:19 jbox kernel: ata14.00: revalidation failed (errno=-2)
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000010,  slot [6].
Mar  5 09:52:19 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:19 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000010,  slot [7].
Mar  5 09:52:20 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [4] tag[4], task [ffff880081a7eb40]:
Mar  5 09:52:20 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000010,  slot [4].
Mar  5 09:52:20 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:20 jbox kernel: sas: Enter sas_scsi_recover_host busy: 2 failed: 2
Mar  5 09:52:20 jbox kernel: sas: ata15: end_device-8:0: cmd error handler
Mar  5 09:52:20 jbox kernel: sas: ata17: end_device-8:2: cmd error handler
Mar  5 09:52:20 jbox kernel: sas: ata15: end_device-8:0: dev error handler
Mar  5 09:52:20 jbox kernel: ata15.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
Mar  5 09:52:20 jbox kernel: sas: ata16: end_device-8:1: dev error handler
Mar  5 09:52:20 jbox kernel: ata15.00: failed command: READ DMA EXT
Mar  5 09:52:20 jbox kernel: ata15.00: cmd 25/00:00:10:d8:a7/00:02:03:00:00/e0 tag 7 dma 262144 in
Mar  5 09:52:20 jbox kernel:         res 01/04:00:0f:d8:a7/00:00:03:00:00/e0 Emask 0x2 (HSM violation)
Mar  5 09:52:20 jbox kernel: sas: ata17: end_device-8:2: dev error handler
Mar  5 09:52:20 jbox kernel: ata15.00: status: { ERR }
Mar  5 09:52:20 jbox kernel: ata15.00: error: { ABRT }
Mar  5 09:52:20 jbox kernel: ata15: hard resetting link
Mar  5 09:52:20 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:20 jbox kernel: ata17: failed to read log page 10h (errno=-5)
Mar  5 09:52:20 jbox kernel: sas: ata18: end_device-8:3: dev error handler
Mar  5 09:52:20 jbox kernel: ata17.00: exception Emask 0x1 SAct 0x40000000 SErr 0x0 action 0x6
Mar  5 09:52:20 jbox kernel: ata17.00: failed command: WRITE FPDMA QUEUED
Mar  5 09:52:20 jbox kernel: ata17.00: cmd 61/e0:00:a0:a5:76/01:00:00:00:00/40 tag 30 ncq 245760 out
Mar  5 09:52:20 jbox kernel:         res 01/04:50:10:d6:a7/00:00:03:00:00/40 Emask 0x3 (HSM violation)
Mar  5 09:52:20 jbox kernel: ata17.00: status: { ERR }
Mar  5 09:52:20 jbox kernel: ata17.00: error: { ABRT }
Mar  5 09:52:20 jbox kernel: ata17: hard resetting link
Mar  5 09:52:20 jbox kernel: mvsas 0000:01:00.0: Phy7 : No sig fis
Mar  5 09:52:21 jbox kernel: mvsas 0000:02:00.0: Phy0 : No sig fis
Mar  5 09:52:21 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:21 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:21 jbox kernel: ata17.00: both IDENTIFYs aborted, assuming NODEV
Mar  5 09:52:21 jbox kernel: ata17.00: revalidation failed (errno=-2)
Mar  5 09:52:24 jbox kernel: ata14: hard resetting link
Mar  5 09:52:24 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [0] tag[0], task [ffff880081a7e780]:
Mar  5 09:52:24 jbox kernel: sas: sas_ata_task_done: SAS error 8a
Mar  5 09:52:24 jbox kernel: ata14.00: failed to IDENTIFY (I/O error, err_mask=0x11)
Mar  5 09:52:24 jbox kernel: ata14.00: revalidation failed (errno=-5)
Mar  5 09:52:26 jbox kernel: ata17: hard resetting link
Mar  5 09:52:26 jbox kernel: ata15.00: qc timeout (cmd 0xec)
Mar  5 09:52:26 jbox kernel: ata15.00: failed to IDENTIFY (I/O error, err_mask=0x5)
Mar  5 09:52:26 jbox kernel: ata15.00: revalidation failed (errno=-5)
Mar  5 09:52:26 jbox kernel: ata15: hard resetting link
Mar  5 09:52:26 jbox kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
Mar  5 09:52:26 jbox kernel: sas: sas_form_port: phy7 belongs to port7 already(1)!
Mar  5 09:52:28 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[0]:rc= 0
Mar  5 09:52:29 jbox kernel: ata14: hard resetting link
Mar  5 09:52:29 jbox kernel: ata14.00: configured for UDMA/133
Mar  5 09:52:29 jbox kernel: ata14: EH complete
Mar  5 09:52:29 jbox kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Mar  5 09:52:31 jbox kernel: ata17.00: qc timeout (cmd 0x27)
Mar  5 09:52:31 jbox kernel: ata17.00: failed to read native max address (err_mask=0x4)
Mar  5 09:52:31 jbox kernel: ata17.00: HPA support seems broken, skipping HPA handling
Mar  5 09:52:31 jbox kernel: ata17.00: revalidation failed (errno=-5)
Mar  5 09:52:31 jbox kernel: ata17: hard resetting link
Mar  5 09:52:31 jbox kernel: sas: sas_form_port: phy2 belongs to port2 already(1)!
Mar  5 09:52:33 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[2]:rc= 0
Mar  5 09:52:33 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000002,  slot [0].
Mar  5 09:52:37 jbox kernel: mdcmd (48): nocheck 
Mar  5 09:52:38 jbox kernel: ata15.00: qc timeout (cmd 0xec)
Mar  5 09:52:38 jbox kernel: ata15.00: failed to IDENTIFY (I/O error, err_mask=0x5)
Mar  5 09:52:38 jbox kernel: ata15.00: revalidation failed (errno=-5)
Mar  5 09:52:38 jbox kernel: ata15: hard resetting link
Mar  5 09:52:38 jbox kernel: ata17.00: qc timeout (cmd 0xef)
Mar  5 09:52:38 jbox kernel: ata17.00: failed to set xfermode (err_mask=0x4)
Mar  5 09:52:38 jbox kernel: ata17.00: disabled
Mar  5 09:52:38 jbox kernel: ata17: hard resetting link
Mar  5 09:52:38 jbox kernel: sas: sas_form_port: phy2 belongs to port2 already(1)!
Mar  5 09:52:39 jbox kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
Mar  5 09:52:40 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[0]:rc= 0
Mar  5 09:52:40 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[2]:rc= 0
Mar  5 09:52:41 jbox kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000002,  slot [0].
Mar  5 09:52:41 jbox kernel: ata17: EH complete
Mar  5 09:52:46 jbox kernel: ata15.00: qc timeout (cmd 0x27)
Mar  5 09:52:46 jbox kernel: ata15.00: failed to read native max address (err_mask=0x4)
Mar  5 09:52:46 jbox kernel: ata15.00: HPA support seems broken, skipping HPA handling
Mar  5 09:52:46 jbox kernel: ata15.00: revalidation failed (errno=-5)
Mar  5 09:52:46 jbox kernel: ata15.00: disabled
Mar  5 09:52:46 jbox kernel: ata15: hard resetting link
Mar  5 09:52:46 jbox kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
Mar  5 09:52:48 jbox kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[0]:rc= 0
Mar  5 09:52:48 jbox kernel: ata15: EH complete
Mar  5 09:52:48 jbox kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Mar  5 09:52:48 jbox kernel: sd 8:0:0:0: [sdj]  
Mar  5 09:52:48 jbox kernel: Result: hostbyte=0x04 driverbyte=0x00
Mar  5 09:52:48 jbox kernel: sd 8:0:0:0: [sdj] CDB: 
Mar  5 09:52:48 jbox kernel: cdb[0]=0x88: 88 00 00 00 00 00 03 a7 d8 10 00 00 02 00 00 00
Mar  5 09:52:48 jbox kernel: blk_update_request: I/O error, dev sdj, sector 61331472
Mar  5 09:52:48 jbox kernel: md: disk2 read error, sector=61331408
Mar  5 09:52:48 jbox kernel: md: multiple disk errors, sector=61331408
Mar  5 09:52:48 jbox kernel: md: disk2 read error, sector=61331416
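
As a cross-check on the last few log lines, the CDB dumped for sdj can be decoded by hand. This is only a sketch: it assumes the standard READ(16) layout (opcode 0x88, 8-byte big-endian LBA in bytes 2-9, 4-byte transfer length in bytes 10-13) and 512-byte logical sectors, and `decode_read16` is a made-up helper name, not something from the kernel or unRAID.

```python
def decode_read16(cdb_hex):
    """Decode a SCSI READ(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length in blocks
    # Assumption: 512-byte logical sectors
    return {"lba": lba, "blocks": blocks, "bytes": blocks * 512}


# CDB copied from the "cdb[0]=0x88" line in the log above
info = decode_read16("88 00 00 00 00 00 03 a7 d8 10 00 00 02 00 00 00")
print(info)
```

The decoded LBA is 61331472, the same sector that blk_update_request reports for sdj, and 512 blocks of 512 bytes is the 262144-byte READ DMA EXT that failed on ata15. So the I/O error at the end is the same request that hit the HSM violation, not a separate failure.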

Link to comment

an excerpt from here: http://tali.admingilde.org/linux-docbook/libata/ch07.html

 

HSM violation

 

This error is indicated when STATUS value doesn't match HSM requirement during issuing or execution of any ATA/ATAPI command.

 

Examples

 

ATA_STATUS doesn't contain !BSY && DRDY && !DRQ while trying to issue a command.

 

!BSY && !DRQ during PIO data transfer.

 

DRQ on command completion.

 

!BSY && ERR after CDB transfer starts but before the last byte of CDB is transferred. ATA/ATAPI standard states that "The device shall not terminate the PACKET command with an error before the last byte of the command packet has been written" in the error outputs description of PACKET command and the state diagram doesn't include such transitions.

 

In these cases, HSM is violated and not much information regarding the error can be acquired from STATUS or ERROR register. IOW, this error can be anything - driver bug, faulty device, controller and/or cable.

 

As HSM is violated, reset is necessary to restore known state. Reconfiguring transport for lower speed might be helpful too as transmission errors sometimes cause this kind of errors.
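
For reading the Emask / err_mask values in these logs, here is a small decoder sketch. The bit values are the ones I recall from enum ata_completion_errors in the kernel's include/linux/libata.h, so treat them as an assumption and check them against your own kernel source. `decode_emask` is just a made-up helper name.

```python
# Bit values assumed from enum ata_completion_errors (include/linux/libata.h)
AC_ERR_BITS = {
    0x001: "device error",
    0x002: "HSM violation",
    0x004: "timeout",
    0x008: "media error",
    0x010: "ATA bus error",
    0x020: "host bus error",
    0x040: "system error",
    0x080: "invalid argument",
    0x100: "unknown error",
    0x200: "no-device hint",
    0x400: "NCQ marker",
}


def decode_emask(mask):
    """Return the names of all error bits set in an Emask / err_mask value."""
    return [name for bit, name in AC_ERR_BITS.items() if mask & bit]


for value in (0x2, 0x3, 0x4, 0x5, 0x11, 0x481):
    print(hex(value), decode_emask(value))
```

With these values, Emask 0x3 is a device error plus an HSM violation, err_mask=0x4 is a plain timeout, and err_mask=0x11 is a device error plus an ATA bus error. None of that narrows it down to driver, drive, controller or cable, which is the point the excerpt makes.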

 

It's talking about lowering the speed for the drives, which most likely isn't possible. But I would reseat all of the cables at both the HBA and the backplane/drives.

 

Also, if you've made the wiring nice and pretty with tie straps, I would cut them all and just let the cables fall where they may (especially if you had power lines tied to the data lines)

Link to comment

I'm also having problems similar to yours with my AOC-SAS2LP-MV8. I have changed cables, but it didn't change anything.

It looks to me like my WD WD60EFRX and the AOC-SAS2LP-MV8 don't like each other.

For me the errors happen when the disk is spinning down.

I'm going to move the disk to the controller on the mainboard to see if that changes anything.

 

Here is what's in my log:

Mar  1 20:20:41 Server1 kernel: mdcmd (47): spindown 4
Mar  1 20:20:50 Server1 kernel: mdcmd (48): spindown 3
Mar  1 20:21:31 Server1 kernel: mdcmd (49): spindown 6
Mar  1 20:23:36 Server1 kernel: mdcmd (50): spindown 1
Mar  1 20:24:04 Server1 kernel: mdcmd (51): spindown 2
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [0] tag[0], task [ffff88011afe1a40]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007FF,  slot [0].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [1] tag[1], task [ffff880408ba7180]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007FE,  slot [1].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [2] tag[2], task [ffff880408ba68c0]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007FC,  slot [2].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [3] tag[3], task [ffff880408ba7900]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007F8,  slot [3].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [4] tag[4], task [ffff880408ba6780]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007F0,  slot [4].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [5] tag[5], task [ffff880408ba6dc0]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007E0,  slot [5].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [6] tag[6], task [ffff880408ba77c0]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 000007C0,  slot [6].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [7] tag[7], task [ffff880408ba6f00]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000780,  slot [7].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [8] tag[8], task [ffff880408ba7680]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000700,  slot [8].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [9] tag[9], task [ffff880408ba7540]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000600,  slot [9].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_sas.c 1967:Release slot [a] tag[a], task [ffff880408ba63c0]:
Mar  1 20:26:42 Server1 kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000400,  slot [a].
Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a
Mar  1 20:26:42 Server1 kernel: sas: Enter sas_scsi_recover_host busy: 11 failed: 11
Mar  1 20:26:42 Server1 kernel: sas: ata11: end_device-1:4: cmd error handler
Mar  1 20:26:42 Server1 kernel: sas: ata7: end_device-1:0: dev error handler
Mar  1 20:26:42 Server1 kernel: sas: ata8: end_device-1:1: dev error handler
Mar  1 20:26:42 Server1 kernel: sas: ata9: end_device-1:2: dev error handler
Mar  1 20:26:42 Server1 kernel: sas: ata10: end_device-1:3: dev error handler
Mar  1 20:26:42 Server1 kernel: sas: ata11: end_device-1:4: dev error handler
Mar  1 20:26:42 Server1 kernel: ata11.00: exception Emask 0x0 SAct 0x1ffc000 SErr 0x0 action 0x6
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: READ FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 60/08:00:78:7b:00/00:00:80:01:00/40 tag 14 ncq 4096 in
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:93:d4/04:00:8f:00:00/40 tag 15 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 41/10:00:20:93:d4/00:00:8f:00:00/40 Emask 0x481 (invalid argument) <F>
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { DRDY ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { IDNF }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:97:d4/04:00:8f:00:00/40 tag 16 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:9b:d4/04:00:8f:00:00/40 tag 17 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:9f:d4/04:00:8f:00:00/40 tag 18 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:a3:d4/04:00:8f:00:00/40 tag 19 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:a7:d4/04:00:8f:00:00/40 tag 20 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:ab:d4/04:00:8f:00:00/40 tag 21 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:af:d4/04:00:8f:00:00/40 tag 22 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:b3:d4/04:00:8f:00:00/40 tag 23 ncq 524288 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11.00: failed command: WRITE FPDMA QUEUED
Mar  1 20:26:42 Server1 kernel: ata11.00: cmd 61/00:00:20:b7:d4/03:00:8f:00:00/40 tag 24 ncq 393216 out
Mar  1 20:26:42 Server1 kernel:         res 01/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Mar  1 20:26:42 Server1 kernel: ata11.00: status: { ERR }
Mar  1 20:26:42 Server1 kernel: ata11.00: error: { ABRT }
Mar  1 20:26:42 Server1 kernel: ata11: hard resetting link
Mar  1 20:26:43 Server1 kernel: ata11.00: configured for UDMA/133
Mar  1 20:26:43 Server1 kernel: ata11: EH complete
Mar  1 20:26:43 Server1 kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
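
To check the spin-down theory against a longer syslog, something like the following sketch could measure how long after the last `mdcmd ... spindown` line the first `SAS error` appears. It assumes the line format shown above. Syslog timestamps carry no year, so the `year` argument is an arbitrary placeholder, and the function name is made up for illustration. It does not know which unRAID disk number sits on which ata port, so it only gives a delay, not proof.

```python
import re
from datetime import datetime

LINE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(.*)$")
SPINDOWN = re.compile(r"mdcmd \(\d+\): spindown \d+")


def parse_time(stamp, year=2000):
    # Collapse the double space syslog uses for single-digit days
    normalized = " ".join(stamp.split())
    return datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S")


def seconds_from_spindown_to_error(lines, year=2000):
    """Seconds between the most recent spindown and the first SAS error after it."""
    last_spindown = None
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        when, message = parse_time(match.group(1), year), match.group(2)
        if SPINDOWN.search(message):
            last_spindown = when
        elif "SAS error" in message and last_spindown is not None:
            return (when - last_spindown).total_seconds()
    return None


sample = [
    "Mar  1 20:23:36 Server1 kernel: mdcmd (50): spindown 1",
    "Mar  1 20:24:04 Server1 kernel: mdcmd (51): spindown 2",
    "Mar  1 20:26:42 Server1 kernel: sas: sas_ata_task_done: SAS error 8a",
]
delay = seconds_from_spindown_to_error(sample)
print(delay)
```

On the three lines copied from the log above this gives 158 seconds, i.e. the errors start a couple of minutes after the last spindown rather than at the same instant.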

Link to comment

Check for firmware and BIOS updates.

 

The card firmware and BIOS are up to date.

 

Interestingly, if I remove a certain drive I get no errors, so I'm starting to wonder if a certain bay is causing the errors for all the other drives. I'm going to put a known working drive into that bay and see what happens.

 

Edit: so far it seems that a specific drive was causing the other drives on the same backplane to fail. Does anyone have any idea what's happening here? I'm a bit clueless.
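
One way to narrow down which drive sets things off is to pull out only the "cmd error handler" lines from the syslog. In the first log in this thread, every port on the host gets a "dev error handler" line, but only a few get a "cmd error handler" line. Reading the latter as "the port whose command actually failed" is my assumption, not something confirmed here. A minimal sketch, with a made-up function name:

```python
import re

CMD_ERR = re.compile(r"sas: (ata\d+): (end_device-\d+:\d+): cmd error handler")


def ports_with_failed_commands(lines):
    """Return (ata port, end device) pairs in order of first appearance."""
    seen = []
    for line in lines:
        match = CMD_ERR.search(line)
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return seen


sample = [
    "Mar  5 09:52:19 jbox kernel: sas: ata14: end_device-1:7: cmd error handler",
    "Mar  5 09:52:19 jbox kernel: sas: ata7: end_device-1:0: dev error handler",
    "Mar  5 09:52:19 jbox kernel: sas: ata14: end_device-1:7: dev error handler",
    "Mar  5 09:52:20 jbox kernel: sas: ata15: end_device-8:0: cmd error handler",
    "Mar  5 09:52:20 jbox kernel: sas: ata17: end_device-8:2: cmd error handler",
]
ports = ports_with_failed_commands(sample)
print(ports)
```

On those lines it lists ata14 first, then ata15 and ata17. Running it over a full syslog from several failed parity checks would show whether the same port always comes first.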

Link to comment

It looks to me like my WD WD60EFRX and the AOC-SAS2LP-MV8 don't like each other.

For me the errors happen when the disk is spinning down.

 

Both my brother and I have had this problem for some time in our unRAID systems. It began when we started putting WD WD60EFRX drives into our systems, connected to an AOC-SASLP-MV8 board.

 

Setting these drives to not spin down made the problems go away, so it is a workaround but not an ideal solution (especially when I do a new config, forget to turn off the spin down on these drives, and then have to do a file system check and check/rebuild parity when it subsequently goes wrong).

 

However, my brother has not had the problem since he left the drives on the normal default spin down with the latest beta14b, whilst I, also on the latest beta, have had the problem (after I forgot to turn off the spin down following a new config) and have therefore gone back to ensuring my WD WD60EFRX drives do not spin down.

 

Link to comment

Check for firmware and BIOS updates.

 

The card firmware and BIOS are up to date.

 

Interestingly, if I remove a certain drive I get no errors, so I'm starting to wonder if a certain bay is causing the errors for all the other drives. I'm going to put a known working drive into that bay and see what happens.

 

Edit: so far it seems that a specific drive was causing the other drives on the same backplane to fail. Does anyone have any idea what's happening here? I'm a bit clueless.

 

Which disk was causing the trouble for you?

I have not had any more errors after moving my WD Red 6TB disk to the SATA ports on the mainboard.

Link to comment
