ridewithjoe Posted May 17, 2017
I'm seeing a lot of the following errors and trying to figure out what may be causing them: "exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen". I've attached a diagnostics dump (nasvm-diagnostics-20170517-1727.zip). It almost seems like the embedded mainboard SATA controllers are choking. That's just a guess, though. Any thoughts on a direction are appreciated.
JorgeB Posted May 17, 2017
You're having multiple issues:

1) The 4 disks connected to the port multiplier were missing at boot-up:

May 13 00:27:27 nasvm kernel: mdcmd (10): import 9
May 13 00:27:27 nasvm kernel: md: import_slot: 9 missing
May 13 00:27:27 nasvm kernel: mdcmd (11): import 10
May 13 00:27:27 nasvm kernel: md: import_slot: 10 missing
May 13 00:27:27 nasvm kernel: mdcmd (12): import 11
May 13 00:27:27 nasvm kernel: md: import_slot: 11 missing
May 13 00:27:27 nasvm kernel: mdcmd (13): import 12
May 13 00:27:27 nasvm kernel: md: import_slot: 12 missing

The controller reset itself and the 4 disks appeared. I have read of issues with ASMedia + port multiplier before, though after this initial hiccup they behaved.

May 13 00:27:59 nasvm kernel: ata7: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
May 13 00:27:59 nasvm kernel: ata7: irq_stat 0x00400040, connection status changed
May 13 00:27:59 nasvm kernel: ata7: SError: { PHYRdyChg CommWake DevExch }
May 13 00:27:59 nasvm kernel: ata7: hard resetting link
May 13 00:28:05 nasvm kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 13 00:28:05 nasvm kernel: ata7.15: Port Multiplier 1.2, 0x197b:0x575f r0, 15 ports, feat 0x5/0xf
May 13 00:28:05 nasvm kernel: ata7.00: hard resetting link
May 13 00:28:19 nasvm kernel: ata7.15: qc timeout (cmd 0xe4)
May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 0 (Emask=0x4)
May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 0 (Emask=0x40)
May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 1 (Emask=0x40)
May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 0 (Emask=0x40)
May 13 00:28:19 nasvm kernel: ata7.01: hard resetting link
May 13 00:28:19 nasvm kernel: ata7.01: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 13 00:28:19 nasvm kernel: ata7.02: hard resetting link
May 13 00:28:19 nasvm kernel: ata7.02: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 13 00:28:19 nasvm kernel: ata7.03: hard resetting link
May 13 00:28:20 nasvm kernel: ata7.03: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 13 00:28:20 nasvm kernel: ata7.04: hard resetting link
May 13 00:28:20 nasvm kernel: ata7.04: SATA link down (SStatus 0 SControl 330)
May 13 00:28:20 nasvm kernel: ata7.05: hard resetting link
May 13 00:28:21 nasvm kernel: ata7.05: SATA link down (SStatus 0 SControl 330)
May 13 00:28:21 nasvm kernel: ata7.06: hard resetting link
May 13 00:28:21 nasvm kernel: ata7.06: SATA link down (SStatus 0 SControl 330)
May 13 00:28:21 nasvm kernel: ata7.07: hard resetting link
May 13 00:28:21 nasvm kernel: ata7.07: SATA link down (SStatus 0 SControl 330)
May 13 00:28:21 nasvm kernel: ata7.08: hard resetting link
May 13 00:28:21 nasvm kernel: ata7.08: SATA link down (SStatus 0 SControl 330)
May 13 00:28:21 nasvm kernel: ata7.09: hard resetting link
May 13 00:28:22 nasvm kernel: ata7.09: SATA link down (SStatus 0 SControl 330)
May 13 00:28:22 nasvm kernel: ata7.10: hard resetting link
May 13 00:28:22 nasvm kernel: ata7.10: SATA link down (SStatus 0 SControl 330)
May 13 00:28:22 nasvm kernel: ata7.11: hard resetting link
May 13 00:28:22 nasvm kernel: ata7.11: SATA link down (SStatus 0 SControl 330)
May 13 00:28:22 nasvm kernel: ata7.12: hard resetting link
May 13 00:28:23 nasvm kernel: ata7.12: SATA link down (SStatus 0 SControl 330)
May 13 00:28:23 nasvm kernel: ata7.13: hard resetting link
May 13 00:28:23 nasvm kernel: ata7.13: SATA link down (SStatus 0 SControl 330)
May 13 00:28:23 nasvm kernel: ata7.14: hard resetting link
May 13 00:28:23 nasvm kernel: ata7.14: SATA link down (SStatus 0 SControl 330)
May 13 00:28:23 nasvm kernel: ata7.00: ATA-9: ST3000DM001-1ER166, Z500QE1D, CC25, max UDMA/133
May 13 00:28:23 nasvm kernel: ata7.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
May 13 00:28:23 nasvm kernel: ata7.00: configured for UDMA/133
May 13 00:28:23 nasvm kernel: ata7.01: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4E1VYSE25, 82.00A82, max UDMA/133
May 13 00:28:23 nasvm kernel: ata7.01: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
May 13 00:28:23 nasvm kernel: ata7.01: configured for UDMA/133
May 13 00:28:23 nasvm kernel: ata7.02: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4ECK3HRUH, 80.00A80, max UDMA/133
May 13 00:28:23 nasvm kernel: ata7.02: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
May 13 00:28:23 nasvm kernel: ata7.02: configured for UDMA/133
May 13 00:28:23 nasvm kernel: ata7.03: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4ECK3HKDE, 80.00A80, max UDMA/133
May 13 00:28:23 nasvm kernel: ata7.03: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
May 13 00:28:23 nasvm kernel: ata7.03: configured for UDMA/133
May 13 00:28:23 nasvm kernel: ata7: EH complete

2) The parity disk is having issues; these look to me like an actual disk problem:

May 13 02:59:01 nasvm kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 13 02:59:01 nasvm kernel: ata5.00: failed command: READ DMA EXT
May 13 02:59:01 nasvm kernel: ata5.00: cmd 25/00:40:d8:6f:c0/00:05:3c:02:00/e0 tag 20 dma 688128 in
May 13 02:59:01 nasvm kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
May 13 02:59:01 nasvm kernel: ata5.00: status: { DRDY }
May 13 02:59:01 nasvm kernel: ata5: hard resetting link
May 13 02:59:10 nasvm kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 13 02:59:10 nasvm kernel: ata5.00: configured for UDMA/133
May 13 02:59:10 nasvm kernel: ata5: EH complete
May 13 03:00:25 nasvm kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
May 13 03:00:25 nasvm kernel: ata5.00: irq_stat 0x40000001
May 13 03:00:25 nasvm kernel: ata5.00: failed command: READ DMA EXT
May 13 03:00:25 nasvm kernel: ata5.00: cmd 25/00:40:78:33:c0/00:05:3c:02:00/e0 tag 25 dma 688128 in
May 13 03:00:25 nasvm kernel: res 53/40:00:78:36:c0/00:00:3c:02:00/00 Emask 0x8 (media error)
May 13 03:00:25 nasvm kernel: ata5.00: status: { DRDY SENSE ERR }
May 13 03:00:25 nasvm kernel: ata5.00: error: { UNC }
May 13 03:00:25 nasvm kernel: ata5.00: configured for UDMA/133
May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 Sense Key : 0x3 [current]
May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 ASC=0x11 ASCQ=0x0
May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 CDB: opcode=0x88 88 00 00 00 00 02 3c c0 33 78 00 00 05 40 00 00
May 13 03:00:25 nasvm kernel: blk_update_request: I/O error, dev sdf, sector 9609163640
May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163576
May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163584
May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163592
May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163600
May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163608
May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163616
May 13 03:00:25 nasvm kernel: ata5: EH complete

3) The 4 disks on the Marvell controller are timing out multiple times:

May 14 16:08:43 nasvm kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 14 16:08:43 nasvm kernel: ata13.00: failed command: IDENTIFY DEVICE
May 14 16:08:43 nasvm kernel: ata13.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 16 pio 512 in
May 14 16:08:43 nasvm kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
May 14 16:08:43 nasvm kernel: ata13.00: status: { DRDY }
May 14 16:08:43 nasvm kernel: ata13: hard resetting link
May 14 16:08:43 nasvm kernel: ata13: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 14 16:08:43 nasvm kernel: ata13.00: configured for UDMA/133
May 14 16:08:43 nasvm kernel: ata13: EH complete
May 14 16:39:07 nasvm kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 14 16:39:07 nasvm kernel: ata11.00: failed command: SMART
May 14 16:39:07 nasvm kernel: ata11.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 9 pio 512 in
May 14 16:39:07 nasvm kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
May 14 16:39:07 nasvm kernel: ata11.00: status: { DRDY }
May 14 16:39:07 nasvm kernel: ata11: hard resetting link
May 14 16:39:08 nasvm kernel: ata11: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 14 16:39:08 nasvm kernel: ata11.00: configured for UDMA/133
May 14 16:39:08 nasvm kernel: ata11: EH complete
May 14 18:09:36 nasvm kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 14 18:09:36 nasvm kernel: ata14.00: failed command: SMART
May 14 18:09:36 nasvm kernel: ata14.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 12 pio 512 in
May 14 18:09:36 nasvm kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
May 14 18:09:36 nasvm kernel: ata14.00: status: { DRDY }
May 14 18:09:36 nasvm kernel: ata14: hard resetting link
May 14 18:09:36 nasvm kernel: ata14: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 14 18:09:36 nasvm kernel: ata14.00: configured for UDMA/133
May 14 18:09:36 nasvm kernel: ata14: EH complete
...
May 14 22:11:47 nasvm kernel: ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 14 22:11:47 nasvm kernel: ata12.00: failed command: SMART
May 14 22:11:47 nasvm kernel: ata12.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 9 pio 512 in
May 14 22:11:47 nasvm kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
May 14 22:11:47 nasvm kernel: ata12.00: status: { DRDY }
May 14 22:11:47 nasvm kernel: ata12: hard resetting link
May 14 22:11:48 nasvm kernel: ata12: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 14 22:11:48 nasvm kernel: ata12.00: configured for UDMA/133
May 14 22:11:48 nasvm kernel: ata12: EH complete

This is a known issue with these controllers. I advise replacing it with an LSI (get an 8-port one and you can get rid of the port multiplier as well).
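For anyone sifting through a long syslog for the same pattern, here is a minimal Python sketch that tallies failed ATA commands per port, keyed by the reason the kernel prints in parentheses after `Emask` (for example `timeout` or `media error`). The function name `summarize`, the two regular expressions, and the `SAMPLE` excerpt are illustrative assumptions; the sketch only relies on the line shapes visible in the logs quoted above and has not been checked against other kernel versions.

```python
import re
from collections import Counter, defaultdict

# "ataN.MM: failed command: <NAME>" names the command that failed.
FAILED_RE = re.compile(r"(ata\d+)\.\d+: failed command: (.+)$")
# The result line ends in "Emask 0x4 (timeout)" or similar. The earlier
# "exception Emask 0x0 SAct ..." line has no parenthesised reason, so it
# does not match this pattern.
RES_RE = re.compile(r"Emask (0x[0-9a-f]+) \(([^)]+)\)")

# Excerpt copied from the log lines quoted in this thread.
SAMPLE = """\
May 13 02:59:01 nasvm kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 13 02:59:01 nasvm kernel: ata5.00: failed command: READ DMA EXT
May 13 02:59:01 nasvm kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
May 13 03:00:25 nasvm kernel: ata5.00: failed command: READ DMA EXT
May 13 03:00:25 nasvm kernel: res 53/40:00:78:36:c0/00:00:3c:02:00/00 Emask 0x8 (media error)
May 14 16:39:07 nasvm kernel: ata11.00: failed command: SMART
May 14 16:39:07 nasvm kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
"""


def summarize(log_text):
    """Return {port: Counter({(command, reason): count})} for failed commands."""
    summary = defaultdict(Counter)
    pending = None  # (port, command) still waiting for its result line
    for line in log_text.splitlines():
        failed = FAILED_RE.search(line)
        if failed:
            pending = (failed.group(1), failed.group(2).strip())
            continue
        result = RES_RE.search(line)
        if result and pending:
            port, command = pending
            summary[port][(command, result.group(2))] += 1
            pending = None
    return summary


if __name__ == "__main__":
    for port, counts in sorted(summarize(SAMPLE).items()):
        for (command, reason), n in counts.items():
            print(f"{port}: {command} -> {reason} x{n}")
```

A port that only ever shows `timeout` on housekeeping commands such as SMART or IDENTIFY DEVICE fits the controller explanation above, while `media error` on a read points more towards the disk itself; treat that reading as a rule of thumb drawn from this thread, not a diagnosis.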
ridewithjoe (Author) Posted May 18, 2017
Thanks for the info. I just ordered an LSI 8-port controller. I suspected that the additional cheapie one might be the issue. I'll swap that parity drive too. I recently got a deal on an 8TB Seagate, but it's not going to cut it as a parity drive. I'll move it to another slot. That port multiplier problem may be harder to resolve, because I have those 4 drives in an external enclosure since I don't have enough drive slots in my tower.
DarkHorse Posted June 7, 2017
I am running into the same issue using my ASRock EP2C602 onboard controllers. I think the Marvell ones are choking. Everything was working great until I attempted a bulk copy via the GbE port from another machine, which must have pushed the controller a bit harder than when I copied files to the array via a WiFi connection. After the copy via GbE ran for a while, the system suddenly hung, the GUI was unresponsive, and I had to do a hard reset/reboot. I now see errors similar to what you reported.

I did a search on the forums for info on the LSI controller board mentioned, but got a lot of confusing hits. Can you point me to the model of LSI board you ordered? Does it need to have the firmware refreshed or anything for it to work properly with UnRAID? Thanks!

[ 816.212348] ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 816.212355] ata9.00: failed command: IDENTIFY DEVICE
[ 816.212363] ata9.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 11 pio 512 in
         res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 816.212365] ata9.00: status: { DRDY }
[ 816.212373] ata9: hard resetting link
[ 816.519211] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 816.522209] ata9.00: configured for UDMA/133
[ 816.522261] ata9: EH complete
[ 934.991956] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 934.991962] ata8.00: failed command: IDENTIFY DEVICE
[ 934.991970] ata8.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 8 pio 512 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 934.991972] ata8.00: status: { DRDY }
[ 934.991980] ata8: hard resetting link
[ 940.301701] ata8: link is slow to respond, please be patient (ready=0)
[ 945.040519] ata8: COMRESET failed (errno=-16)
[ 945.040526] ata8: hard resetting link
[ 950.389337] ata8: link is slow to respond, please be patient (ready=0)
[ 955.069179] ata8: COMRESET failed (errno=-16)
[ 955.069187] ata8: hard resetting link
[ 960.421974] ata8: link is slow to respond, please be patient (ready=0)
[ 990.074936] ata8: COMRESET failed (errno=-16)
[ 990.074945] ata8: limiting SATA link speed to 3.0 Gbps
[ 990.074947] ata8: hard resetting link
[ 995.115761] ata8: COMRESET failed (errno=-16)
[ 995.115769] ata8: reset failed, giving up
[ 995.115771] ata8.00: disabled
[ 995.115819] ata8: EH complete
[ 995.115867] sd 9:0:0:0: [sde] tag#10 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[ 995.115884] sd 9:0:0:0: [sde] tag#10 CDB: opcode=0x88 88 00 00 00 00 00 0b 04 f0 c8 00 00 03 00 00 00
[ 995.115888] blk_update_request: I/O error, dev sde, sector 184873160
[ 995.115939] sd 9:0:0:0: [sde] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[ 995.115942] md: disk0 read error, sector=184873096
[ 995.115950] sd 9:0:0:0: [sde] tag#12 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
[ 995.115957] blk_update_request: I/O error, dev sde, sector 0
[ 995.115979] md: disk0 read error, sector=184873104
[ 995.115996] md: disk0 read error, sector=184873112
[ 995.116002] md: disk0 read error, sector=184873120
[ 995.116004] sd 9:0:0:0: [sde] tag#13 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[ 995.116008] sd 9:0:0:0: [sde] tag#13 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
[ 995.116010] md: disk0 read error, sector=184873128
[ 995.116013] blk_update_request: I/O error, dev sde, sector 0
[ 995.116020] md: disk0 read error, sector=184873136
[ 995.116026] md: disk0 read error, sector=184873144
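The `CDB: opcode=0x88 ...` lines in these logs can be tied back to the `blk_update_request` sector numbers by hand. Opcode 0x88 is SCSI READ(16), which carries a big-endian 8-byte logical block address in bytes 2 to 9 and a 4-byte transfer length in bytes 10 to 13. The following is a minimal Python sketch of that decoding; the helper name `decode_read16` is an assumption made for illustration and is not part of any tool mentioned in the thread.

```python
def decode_read16(cdb_hex):
    """Decode a SCSI READ(16) CDB given as space-separated hex bytes.

    Returns (starting LBA, number of blocks requested).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("expected a 16-byte READ(16) CDB (opcode 0x88)")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


if __name__ == "__main__":
    # CDB from the sde errors in the post above
    print(decode_read16("88 00 00 00 00 00 0b 04 f0 c8 00 00 03 00 00 00"))
    # (184873160, 768)

    # CDB from the sdf (parity disk) errors earlier in the thread
    print(decode_read16("88 00 00 00 00 02 3c c0 33 78 00 00 05 40 00 00"))
    # (9609163640, 1344)
```

Both decoded addresses match the sector reported on the following `blk_update_request` line, and 1344 blocks of 512 bytes is the 688128 bytes shown in the failing `READ DMA EXT` command. In both logs the first `md: disk0 read error` sector is 64 lower than the device sector, which would be consistent with a partition that starts at sector 64, though the thread itself does not confirm that.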
JorgeB Posted June 7, 2017
1 hour ago, DarkHorse said: "using my ASRock EP2C602 onboard controllers.... I think the Marvell ones are choking"
You definitely don't want to use the 4 Marvell ports on those boards; you're going to have disks dropping willy-nilly.
Edited June 7, 2017 by johnnie.black
dmacias Posted June 7, 2017
There are firmware updates for the Marvell controllers on the ASRock Rack website.
DarkHorse Posted June 7, 2017
Thanks dmacias... my system currently has 2.3.0.1037 and the newest updated firmware is 2.3.0.1063. Gonna give it a shot.
Edited June 7, 2017 by DarkHorse
JorgeB Posted June 7, 2017
12 minutes ago, DarkHorse said: "Thanks dmacias... my system currently has 2.3.0.1037 and the newest updated firmware is 2.3.0.1063. Gonna give it a shot."
Please let us know if it helps.
DarkHorse Posted June 7, 2017
Well, a quick update so far. The Marvell controller firmware updated with no problems. The system is currently rebuilding the parity drive. While that was running, I performed the GbE file copy that caused the issue and everything went fine. I'm continuing to do multiple gigabytes of file copies; so far so good. I'll come back with another update after I have more info. Thanks for everyone's help.
DarkHorse Posted June 8, 2017
Well, over 1 TB of data copied, all while the parity disk is being stressed due to rebuilding.... not a single error so far. I'm pretty sure the firmware update resolved the issue I was seeing. Thanks again dmacias!!!
Skymind Posted August 4, 2021
I was getting an "Exception Emask 0x0 SAct..." error. I swapped out the data and power cables, but the issue remained. Then I realized that the BIOS setting for the SATA controller had somehow been changed to Disabled. I re-enabled AHCI for the controller and the error stopped. The constant I/O activity stopped completely as well. I suppose the BIOS was presenting the drive as an IDE disk rather than a SATA disk.