6.2 beta 21 system lock up error? ATA UnrecovData Handshk


My logs are attached. I was running a program called Ember Media Manager on my Windows box (it was placing JPG files on my server automatically) and my system locked up. I saw this in the log, and now I can't connect. Any thoughts?

 

Jun 7 12:41:15 Hades kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jun 7 12:41:15 Hades kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jun 7 12:41:15 Hades kernel: ata1.00: configured for UDMA/133
Jun 7 12:41:15 Hades kernel: ata1: EH complete
Jun 7 12:47:06 Hades kernel: ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Jun 7 12:47:06 Hades kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Jun 7 12:47:06 Hades kernel: ata1: SError: { UnrecovData Handshk }
Jun 7 12:47:06 Hades kernel: ata1.00: failed command: WRITE DMA EXT
Jun 7 12:47:06 Hades kernel: ata1.00: cmd 35/00:40:b0:be:67/00:05:34:00:00/e0 tag 24 dma 688128 out
Jun 7 12:47:06 Hades kernel: res 50/00:00:77:73:67/00:00:34:00:00/e4 Emask 0x10 (ATA bus error)
Jun 7 12:47:06 Hades kernel: ata1.00: status: { DRDY }
Jun 7 12:47:06 Hades kernel: ata1: hard resetting link
Jun 7 12:47:06 Hades kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jun 7 12:47:06 Hades kernel: ata1.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Jun 7 12:47:06 Hades kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jun 7 12:47:06 Hades kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jun 7 12:47:06 Hades kernel: ata1.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Jun 7 12:47:06 Hades kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jun 7 12:47:06 Hades kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jun 7 12:47:06 Hades kernel: ata1.00: configured for UDMA/133
Jun 7 12:47:06 Hades kernel: ata1: EH complete
Jun 7 12:52:49 Hades kernel: ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Jun 7 12:52:49 Hades kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Jun 7 12:52:49 Hades kernel: ata1: SError: { UnrecovData Handshk }
Jun 7 12:52:49 Hades kernel: ata1.00: failed command: WRITE DMA EXT
Jun 7 12:52:49 Hades kernel: ata1.00: cmd 35/00:40:b8:f3:31/00:05:92:00:00/e0 tag 24 dma 688128 out
Jun 7 12:52:49 Hades kernel: res 50/00:00:d7:79:e3/00:00:48:01:00/e8 Emask 0x10 (ATA bus error)
Jun 7 12:52:49 Hades kernel: ata1.00: status: { DRDY }
Jun 7 12:52:49 Hades kernel: ata1: hard resetting link
Jun 7 12:52:49 Hades kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jun 7 12:52:49 Hades kernel: ata1.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Jun 7 12:52:49 Hades kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jun 7 12:52:49 Hades kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jun 7 12:52:49 Hades kernel: ata1.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Jun 7 12:52:49 Hades kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jun 7 12:52:49 Hades kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jun 7 12:52:49 Hades kernel: ata1.00: configured for UDMA/133
Jun 7 12:52:49 Hades kernel: ata1: EH complete
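
For anyone reading the log above: the `SErr 0x400100` value and the `SError: { UnrecovData Handshk }` line are the same information, once as a bitmask and once as names. Here is a rough sketch of that decoding. The bit table is written from my recollection of the SATA SError register layout and the abbreviations the kernel prints, so treat it as an assumption and check it against the kernel source; `decode_serr` is just a name I made up.

```python
# Assumed mapping of SError bit positions to the short names the kernel prints.
# Reproduced from memory of the SATA SError register; verify before relying on it.
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value):
    """Return the names of the SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value taken from the exception lines in the log above.
    print(decode_serr(0x400100))
```

With the value from my log, only bits 8 and 22 are set, which lines up with the two names in the `SError` line.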

hades-diagnostics-20160607-1310.zip

Based on my own debugging, it's having trouble with two drives (ata7 and ata1). I'm not sure yet whether ata1 is a parity drive (I have two drives with the same name), and I can't look up which drive is my parity because my UI is locked up.
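
In case it helps anyone doing the same kind of debugging, this is roughly how the failing ports can be tallied from a syslog. It is only a sketch: the regular expression is fitted to the `exception Emask` lines quoted in my first post, and `count_exceptions` is a hypothetical helper name, not part of any tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 ..."
# and captures the port ("ata1") without the device suffix (".00").
EXCEPTION_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions(lines):
    """Count libata exception lines per ATA port."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Jun 7 12:47:06 Hades kernel: ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen",
        "Jun 7 12:47:06 Hades kernel: ata1: hard resetting link",
        "Jun 7 12:52:49 Hades kernel: ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen",
    ]
    for port, n in count_exceptions(sample).most_common():
        print(port, n)
```

To run it on a real log, pass an open file object instead of the sample list (the log path depends on the system).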

 

Is there a way I can tell whether it's parity or not? Edit: it's not my parity drive; I found it in the logs under /var/log.
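
Another way to tie an ataN port to a disk without the UI is to look at where each `/sys/block/sdX` entry really points. This sketch assumes the resolved sysfs path contains an `ataN` component (as it does on the libata systems I have seen, but that is an assumption); `ata_port_of` and `map_ports` are hypothetical helper names.

```python
import os
import re

# Looks for a path component such as ".../ata1/host0/..." in a resolved sysfs path.
ATA_RE = re.compile(r"/(ata\d+)/")


def ata_port_of(sysfs_path):
    """Return the 'ataN' component of a resolved sysfs path, or None if absent."""
    match = ATA_RE.search(sysfs_path)
    return match.group(1) if match else None


def map_ports(sys_block="/sys/block"):
    """Map each ATA port to the block device names found behind it."""
    mapping = {}
    for name in sorted(os.listdir(sys_block)):
        real = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_of(real)
        if port:
            mapping.setdefault(port, []).append(name)
    return mapping


if __name__ == "__main__":
    # Only meaningful on a Linux host that exposes /sys/block.
    if os.path.isdir("/sys/block"):
        for port, devices in sorted(map_ports().items()):
            print(port, devices)
```

Once the sdX name is known, the drive's serial number (for example from `smartctl -i /dev/sdX`) can be compared against whatever is assigned as parity.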

 

Also, do you think this is more likely a bad cable, a bad port on the motherboard, not enough power, or just a drive slowly dying?

 

I recently swapped out my motherboard for a new one, along with an i7 and more RAM. Honestly, I didn't pay enough attention beforehand to know whether this was happening before (I definitely had sporadic freezes, though). These issues seem to happen ONLY when I do something to the server from my Windows PC. I've gone 30+ days without issue when I'm not writing to it from a desktop app. CP/Sonarr run without issue for weeks.

 

Since I have a total of 12 SATA ports (4 SATA II and 8 SATA III), I can move the drive between ports to rule out a bad port. I'm not really sure how to check for a possible power issue. I did not swap out my power supply; it's a 520 W unit, and I assumed it was enough for 9 drives (1 being an SSD).

The following was also in the SMART report for the ata1 disk:

 

Error 74 occurred at disk power-on lifetime: 23799 hours (991 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 a0 50 c0 67 04  Error: ICRC, ABRT 160 sectors at LBA = 0x0467c050 = 73908304

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  35 00 40 b0 be 67 e0 00   5d+13:05:34.474  WRITE DMA EXT
  25 00 b8 c0 72 67 e0 00   5d+13:05:34.474  READ DMA EXT
  35 00 d8 70 ae dc e0 00   5d+13:05:34.473  WRITE DMA EXT
  35 00 40 30 a9 dc e0 00   5d+13:05:34.472  WRITE DMA EXT
  35 00 40 f0 a3 dc e0 00   5d+13:05:34.471  WRITE DMA EXT

Error 73 occurred at disk power-on lifetime: 23799 hours (991 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 b0 b0 6f 3c 0c  Error: ICRC, ABRT 176 sectors at LBA = 0x0c3c6fb0 = 205287344

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  35 00 40 20 6c 3c e0 00   5d+12:59:44.567  WRITE DMA EXT
  25 00 38 e8 83 3c e0 00   5d+12:59:44.567  READ DMA EXT
  25 00 40 a8 7e 3c e0 00   5d+12:59:44.559  READ DMA EXT
  25 00 40 68 79 3c e0 00   5d+12:59:44.553  READ DMA EXT
  25 00 40 28 74 3c e0 00   5d+12:59:44.546  READ DMA EXT

Error 72 occurred at disk power-on lifetime: 23799 hours (991 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 f0 30 ea b0 04  Error: ICRC, ABRT 240 sectors at LBA = 0x04b0ea30 = 78703152

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  35 00 40 e0 e7 b0 e0 00   5d+12:52:15.417  WRITE DMA EXT
  35 00 40 a0 e2 b0 e0 00   5d+12:52:15.416  WRITE DMA EXT
  25 00 c0 98 84 b3 e0 00   5d+12:52:15.413  READ DMA EXT
  25 00 40 58 7f b3 e0 00   5d+12:52:15.395  READ DMA EXT
  25 00 a8 80 95 b0 e0 00   5d+12:52:15.391  READ DMA EXT

Error 71 occurred at disk power-on lifetime: 23751 hours (989 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 21 0f 38 2f 0a  Error: ICRC, ABRT 33 sectors at LBA = 0x0a2f380f = 170866703

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 40 f0 36 2f e0 00   3d+13:18:27.976  READ DMA EXT
  25 00 40 b0 31 2f e0 00   3d+13:18:27.967  READ DMA EXT
  25 00 80 30 2e 2f e0 00   3d+13:18:27.963  READ DMA EXT
  25 00 40 f0 28 2f e0 00   3d+13:18:27.954  READ DMA EXT
  25 00 40 b0 25 2f e0 00   3d+13:18:27.950  READ DMA EXT

Error 70 occurred at disk power-on lifetime: 23751 hours (989 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 c1 4f bb 5b 0f  Error: ICRC, ABRT 193 sectors at LBA = 0x0f5bbb4f = 257669967

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 40 d0 b6 5b e0 00   3d+12:36:00.990  READ DMA EXT
  25 00 c0 10 b6 5b e0 00   3d+12:36:00.989  READ DMA EXT
  25 00 40 d0 b0 5b e0 00   3d+12:36:00.982  READ DMA EXT
  25 00 c0 10 b0 5b e0 00   3d+12:36:00.981  READ DMA EXT
  25 00 40 d0 aa 5b e0 00   3d+12:36:00.973  READ DMA EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%     23217         -
# 2  Short offline       Completed without error       00%     23216         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
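
To double-check how the register rows in the error entries above turn into the "Error: ICRC, ABRT ... at LBA" summaries, here is a small sketch. It assumes the 28-bit layout (low nibble of DH, then CH, CL, SN), which reproduces the LBAs printed above, and it only covers the Error register flags I am sure of (ICRC is the interface CRC flag, ABRT is command aborted); `decode_error_row` is a made-up name.

```python
# Error register (ER) flags, highest bit first. Partial list: only the flags
# I am confident about are included.
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNC"), (0x10, "IDNF"), (0x04, "ABRT")]


def decode_error_row(row):
    """Decode one 'ER ST SC SN CL CH DH' row of hex bytes from the error log.

    Returns (flag names, sector count, LBA), assuming 28-bit addressing.
    """
    er, st, sc, sn, cl, ch, dh = (int(field, 16) for field in row.split())
    flags = [name for mask, name in ERROR_BITS if er & mask]
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return flags, sc, lba


if __name__ == "__main__":
    # Register row from "Error 74" above.
    flags, sectors, lba = decode_error_row("84 51 a0 50 c0 67 04")
    print(flags, sectors, hex(lba), lba)
```

For the Error 74 row this gives ICRC and ABRT, 160 sectors, and LBA 0x0467c050 = 73908304, matching the summary line in the report.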
