DaveHavok Posted December 7, 2016

Hello everyone! I'm troubleshooting my unRAID server, which is currently unprotected due to a missing drive.

PROBLEM:
- One of my drives (Drive 5) suddenly started having read problems during a monthly parity check.
- Drive 5 was marked with a red X and listed as "Faulty".
- Trying a potentially cheap fix, I swapped SATA cables.
- Drive 5 passed a few SMART checks.
- I then did a parity sync / rebuild of the same Drive 5.
- The rebuild was successful (2 days later).
- I then attempted a parity check, and Drive 5 immediately started generating read errors.
- I immediately stopped the parity check before any writes to the parity drive could be done.
- I swapped in a brand new drive in the same slot to begin preclearing it.
- I had to power the server off and on a few times to get it to see the drive in the preclear menu and in the device listings.
- The preclear is insanely slow at > 1 MB/s.

Looking at the system log, I see this same error generating repeatedly:

Dec 6 23:14:01 OrigamiNET emhttp: shcmd (426): rmmod md-mod |& logger
Dec 6 23:14:01 OrigamiNET kernel: md: unRAID driver removed
Dec 6 23:14:01 OrigamiNET emhttp: shcmd (427): modprobe md-mod super=/boot/config/super.dat |& logger
Dec 6 23:14:01 OrigamiNET kernel: md: unRAID driver 2.6.8 installed
Dec 6 23:14:01 OrigamiNET emhttp: err: get_key_info: get_message: /boot/config/._Pro.key (-3)
Dec 6 23:14:01 OrigamiNET emhttp: Pro key detected, GUID: 03F0-5307-0000-0000000003F6 FILE: /boot/config/Pro.key
Dec 6 23:14:01 OrigamiNET emhttp: Device inventory:
Dec 6 23:14:01 OrigamiNET emhttp: shcmd (428): udevadm settle
Dec 6 23:14:01 OrigamiNET emhttp: hp_v165w_00000000000003F6-0:0 (sda) 3946464
Dec 6 23:14:01 OrigamiNET emhttp: BP4_mSATA_SSD_FECA07411CEC00143435 (sdq) 117220792
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1CH166_W1F572BT (sdb) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST8000AS0002-1NA17Z_Z840A6S4 (sdc) 7814026532
Dec 6 23:14:01 OrigamiNET emhttp: ST8000AS0002-1NA17Z_Z840BT8X (sdd) 7814026532
Dec 6 23:14:01 OrigamiNET emhttp: ST8000AS0002-1NA17Z_Z840JYK2 (sde) 7814026532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1ER166_Z5005CNJ (sdf) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1CH166_W1F43J37 (sdg) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1CH166_W1F42CT4 (sdh) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1CH166_W1F29MZT (sdi) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1CH166_W1F4TTZC (sdj) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1ER166_Z500MXRB (sdk) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1ER166_Z5005EVB (sdl) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1ER166_Z5005FV5 (sdm) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST8000AS0002-1NA17Z_Z840SL8R (sdn) 7814026532
Dec 6 23:14:01 OrigamiNET emhttp: ST3000DM001-1CH166_W1F28VK2 (sdo) 2930266532
Dec 6 23:14:01 OrigamiNET emhttp: ST8000AS0002-1NA17Z_Z840A1MT (sdp) 7814026532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (1): import 0 sdp 7814026532 0 ST8000AS0002-1NA17Z_Z840A1MT
Dec 6 23:14:01 OrigamiNET kernel: md: import disk0: (sdp) ST8000AS0002-1NA17Z_Z840A1MT size: 7814026532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (2): import 1 sdi 2930266532 0 ST3000DM001-1CH166_W1F29MZT
Dec 6 23:14:01 OrigamiNET kernel: md: import disk1: (sdi) ST3000DM001-1CH166_W1F29MZT size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (3): import 2 sdh 2930266532 0 ST3000DM001-1CH166_W1F42CT4
Dec 6 23:14:01 OrigamiNET kernel: md: import disk2: (sdh) ST3000DM001-1CH166_W1F42CT4 size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (4): import 3 sdg 2930266532 0 ST3000DM001-1CH166_W1F43J37
Dec 6 23:14:01 OrigamiNET kernel: md: import disk3: (sdg) ST3000DM001-1CH166_W1F43J37 size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (5): import 4 sdf 2930266532 0 ST3000DM001-1ER166_Z5005CNJ
Dec 6 23:14:01 OrigamiNET kernel: md: import disk4: (sdf) ST3000DM001-1ER166_Z5005CNJ size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (6): import 5
Dec 6 23:14:01 OrigamiNET kernel: md: import_slot: 5 empty
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (7): import 6 sdm 2930266532 0 ST3000DM001-1ER166_Z5005FV5
Dec 6 23:14:01 OrigamiNET kernel: md: import disk6: (sdm) ST3000DM001-1ER166_Z5005FV5 size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (8): import 7 sdl 2930266532 0 ST3000DM001-1ER166_Z5005EVB
Dec 6 23:14:01 OrigamiNET kernel: md: import disk7: (sdl) ST3000DM001-1ER166_Z5005EVB size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (9): import 8 sdk 2930266532 0 ST3000DM001-1ER166_Z500MXRB
Dec 6 23:14:01 OrigamiNET kernel: md: import disk8: (sdk) ST3000DM001-1ER166_Z500MXRB size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (10): import 9 sdj 2930266532 0 ST3000DM001-1CH166_W1F4TTZC
Dec 6 23:14:01 OrigamiNET kernel: md: import disk9: (sdj) ST3000DM001-1CH166_W1F4TTZC size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (11): import 10 sdb 2930266532 0 ST3000DM001-1CH166_W1F572BT
Dec 6 23:14:01 OrigamiNET kernel: md: import disk10: (sdb) ST3000DM001-1CH166_W1F572BT size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (12): import 11 sdo 2930266532 0 ST3000DM001-1CH166_W1F28VK2
Dec 6 23:14:01 OrigamiNET kernel: md: import disk11: (sdo) ST3000DM001-1CH166_W1F28VK2 size: 2930266532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (13): import 12 sdc 7814026532 0 ST8000AS0002-1NA17Z_Z840A6S4
Dec 6 23:14:01 OrigamiNET kernel: md: import disk12: (sdc) ST8000AS0002-1NA17Z_Z840A6S4 size: 7814026532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (14): import 13 sdd 7814026532 0 ST8000AS0002-1NA17Z_Z840BT8X
Dec 6 23:14:01 OrigamiNET kernel: md: import disk13: (sdd) ST8000AS0002-1NA17Z_Z840BT8X size: 7814026532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (15): import 14 sde 7814026532 0 ST8000AS0002-1NA17Z_Z840JYK2
Dec 6 23:14:01 OrigamiNET kernel: md: import disk14: (sde) ST8000AS0002-1NA17Z_Z840JYK2 size: 7814026532
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (16): import 15
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (17): import 16
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (18): import 17
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (19): import 18
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (20): import 19
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (21): import 20
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (22): import 21
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (23): import 22
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (24): import 23
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (25): import 24
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (26): import 25
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (27): import 26
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (28): import 27
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (29): import 28
Dec 6 23:14:01 OrigamiNET kernel: mdcmd (30): import 29
Dec 6 23:14:01 OrigamiNET kernel: md: import_slot: 29 empty
Dec 6 23:14:01 OrigamiNET emhttp: import 30 cache device: sdq
Dec 6 23:14:01 OrigamiNET emhttp: import flash device: sda
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: exception Emask 0x40 SAct 0x800000 SErr 0x880800 action 0x6 frozen
Dec 6 23:14:29 OrigamiNET kernel: ata5: SError: { HostInt 10B8B LinkSeq }
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: cmd 60/00:b8:20:76:10/01:00:00:00:00/40 tag 23 ncq 131072 in
Dec 6 23:14:29 OrigamiNET kernel: res 40/00:c0:20:14:10/00:00:00:00:00/40 Emask 0x44 (timeout)
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:29 OrigamiNET kernel: ata5: hard resetting link
Dec 6 23:14:29 OrigamiNET kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: configured for UDMA/33
Dec 6 23:14:29 OrigamiNET kernel: ata5: EH complete
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: exception Emask 0x50 SAct 0x20 SErr 0x280900 action 0x6 frozen
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: irq_stat 0x08000000, interface fatal error
Dec 6 23:14:29 OrigamiNET kernel: ata5: SError: { UnrecovData HostInt 10B8B BadCRC }
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: cmd 60/00:28:20:76:10/01:00:00:00:00/40 tag 5 ncq 131072 in
Dec 6 23:14:29 OrigamiNET kernel: res 40/00:28:20:76:10/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:29 OrigamiNET kernel: ata5: hard resetting link
Dec 6 23:14:29 OrigamiNET kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:29 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: configured for UDMA/33
Dec 6 23:14:29 OrigamiNET kernel: ata5: EH complete
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: exception Emask 0x50 SAct 0x60000000 SErr 0x280900 action 0x6 frozen
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: irq_stat 0x08000000, interface fatal error
Dec 6 23:14:29 OrigamiNET kernel: ata5: SError: { UnrecovData HostInt 10B8B BadCRC }
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: cmd 60/00:e8:20:7a:10/01:00:00:00:00/40 tag 29 ncq 131072 in
Dec 6 23:14:29 OrigamiNET kernel: res 40/00:e8:20:7a:10/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: cmd 60/00:f0:20:7b:10/01:00:00:00:00/40 tag 30 ncq 131072 in
Dec 6 23:14:29 OrigamiNET kernel: res 40/00:e8:20:7a:10/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Dec 6 23:14:29 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:29 OrigamiNET kernel: ata5: hard resetting link
Dec 6 23:14:30 OrigamiNET kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: configured for UDMA/33
Dec 6 23:14:30 OrigamiNET kernel: ata5: EH complete
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: exception Emask 0x50 SAct 0xc000 SErr 0x280900 action 0x6 frozen
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: irq_stat 0x08000000, interface fatal error
Dec 6 23:14:30 OrigamiNET kernel: ata5: SError: { UnrecovData HostInt 10B8B BadCRC }
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: cmd 60/00:70:20:7d:10/01:00:00:00:00/40 tag 14 ncq 131072 in
Dec 6 23:14:30 OrigamiNET kernel: res 40/00:70:20:7d:10/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: cmd 60/00:78:20:7e:10/01:00:00:00:00/40 tag 15 ncq 131072 in
Dec 6 23:14:30 OrigamiNET kernel: res 40/00:70:20:7d:10/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:30 OrigamiNET kernel: ata5: hard resetting link
Dec 6 23:14:30 OrigamiNET kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: configured for UDMA/33
Dec 6 23:14:30 OrigamiNET kernel: ata5: EH complete
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: exception Emask 0x10 SAct 0x40000001 SErr 0x280100 action 0x6 frozen
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: irq_stat 0x08000000, interface fatal error
Dec 6 23:14:30 OrigamiNET kernel: ata5: SError: { UnrecovData 10B8B BadCRC }
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: cmd 60/00:00:20:81:10/01:00:00:00:00/40 tag 0 ncq 131072 in
Dec 6 23:14:30 OrigamiNET kernel: res 40/00:f0:20:80:10/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: failed command: READ FPDMA QUEUED
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: cmd 60/00:f0:20:80:10/01:00:00:00:00/40 tag 30 ncq 131072 in
Dec 6 23:14:30 OrigamiNET kernel: res 40/00:f0:20:80:10/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: status: { DRDY }
Dec 6 23:14:30 OrigamiNET kernel: ata5: hard resetting link
Dec 6 23:14:30 OrigamiNET kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 6 23:14:30 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT0._GTF] (Node ffff88082f523a50), AE_NOT_FOUND (20150930/psparse-542)
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: configured for UDMA/33
Dec 6 23:14:30 OrigamiNET kernel: ata5: EH complete

THOUGHTS:
- The SATA cable was swapped and is brand new, so I'm ruling that out.
- Bad port in the Icy Dock?
- Bad port on the motherboard?
- Could my RAID card be going faulty?
- I'm thinking the original Drive 5 was OK and that the port itself might be the issue, since both the new and the old drive appear to be having read problems.
- These lines obviously jump out at me:

Dec 6 23:14:30 OrigamiNET kernel: ata5.00: exception Emask 0x10 SAct 0x40000001 SErr 0x280100 action 0x6 frozen
Dec 6 23:14:30 OrigamiNET kernel: ata5.00: irq_stat 0x08000000, interface fatal error

Any help would be much appreciated!
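For anyone reading these logs later: the hex `SErr` value on each "exception" line and the flag names on the following `SError: { ... }` line carry the same information. Below is a minimal Python sketch of that decoding. The bit positions are an assumption inferred only from the log excerpts in this thread (they reproduce every SErr/SError pair shown here); other SError bits exist and are not covered, and `decode_serr` is a made-up helper name.

```python
# Bit positions inferred from the SErr / SError pairs in this thread's logs.
# Assumption: these five flags are the only ones of interest here; the real
# register defines more bits than this table covers.
SERR_FLAGS = {
    8: "UnrecovData",
    11: "HostInt",
    19: "10B8B",
    21: "BadCRC",
    23: "LinkSeq",
}


def decode_serr(value: int) -> list[str]:
    """Return the names of the known flags set in an SErr value."""
    names = []
    for bit in sorted(SERR_FLAGS):
        if value & (1 << bit):
            names.append(SERR_FLAGS[bit])
    return names


if __name__ == "__main__":
    # Values copied from the "exception" lines in the log above.
    for raw in ("0x880800", "0x280900", "0x280100"):
        print(raw, decode_serr(int(raw, 16)))
```

Read this way, the two lines called out under THOUGHTS (SErr 0x280100) are the same UnrecovData / 10B8B / BadCRC combination as the earlier entries, minus HostInt.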
JorgeB Posted December 7, 2016

Dec 6 23:14:30 OrigamiNET kernel: ata5: SError: { UnrecovData 10B8B BadCRC }

99% of the time these mean a bad SATA cable, but if you have already replaced it, it could be a bad SATA port.
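When deciding between cable, port, and controller, it can help to see which ATA ports report which SError flags and how often. The following is a rough Python sketch that tallies them from syslog lines; the regular expression is based only on the line format shown in this thread, and `tally_serror_flags` is a hypothetical helper, not an unRAID tool.

```python
import re
from collections import Counter, defaultdict

# Matches kernel lines such as:
#   "... kernel: ata5: SError: { UnrecovData 10B8B BadCRC }"
SERROR_RE = re.compile(r"kernel: (ata\d+): SError: \{ ([^}]*)\}")


def tally_serror_flags(lines):
    """Count how often each SError flag appears, grouped by ATA port."""
    counts = defaultdict(Counter)
    for line in lines:
        match = SERROR_RE.search(line)
        if match:
            port, flags = match.groups()
            counts[port].update(flags.split())
    return counts


if __name__ == "__main__":
    sample = [
        "Dec 6 23:14:29 OrigamiNET kernel: ata5: SError: { HostInt 10B8B LinkSeq }",
        "Dec 6 23:14:30 OrigamiNET kernel: ata5: SError: { UnrecovData 10B8B BadCRC }",
        "Dec 6 23:14:30 OrigamiNET kernel: ata5: hard resetting link",
    ]
    for port, flags in tally_serror_flags(sample).items():
        print(port, dict(flags))
```

To run it against a live system, pass it the lines of the syslog file, for example `tally_serror_flags(open("/var/log/syslog"))`; that path is an assumption about where the log lives. Errors that follow one port regardless of which drive is attached point away from the drive itself.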
DaveHavok Posted December 7, 2016 (Author)

"Dec 6 23:14:30 OrigamiNET kernel: ata5: SError: { UnrecovData 10B8B BadCRC } 99% of the time these mean a bad SATA cable, but if you have already replaced it, it could be a bad SATA port."

Yeah, I'm looking through the Drive Analysis doc now (https://lime-technology.com/wiki/index.php/The_Analysis_of_Drive_Issues). Looks like a combo of Drive Interface Issues 1 and 2. I'm also wondering if maybe my power supply is too weak at 750 watts. Hmmm.

UPDATE:
- After running through some PSU calculators, I'm good with the 750-watt size. (http://www.coolermaster.com/power-supply-calculator/)
- I'll probably replace all the SATA cables, since I suspect they are very poor quality and they don't lock.
- Research potential problems with the SUPERMICRO AOC-SASLP-MV8 controller (Is there a popular replacement that's faster / more reliable?)
TSM Posted December 7, 2016

I don't know about the rest of your issues, but I replaced my AOC-SASLP-MV8 with an AOC-SAS2LP-MV8 and was very happy with the decision. I never did figure out what the deal was with my old motherboard and that card. There were various performance issues, some that have been well documented by other forum members in other postings, and 1 or 2 that may have been unique to me. But when I finally gave up and replaced it with the AOC-SAS2LP-MV8, it was like a breath of fresh air for the system.
DaveHavok Posted December 7, 2016 (Author)

"I don't know about the rest of your issues, but I replaced my AOC-SASLP-MV8 with an AOC-SAS2LP-MV8 and was very happy with the decision. I never did figure out what the deal was with my old motherboard and that card. There were various performance issues, some that have been well documented by other forum members in other postings, and 1 or 2 that may have been unique to me. But when I finally gave up and replaced it with the AOC-SAS2LP-MV8, it was like a breath of fresh air for the system."

Thank you for the feedback on this! I'm feeling like this is where the speed bottleneck currently is in my system. I do wish I had done some more research before having the knee-jerk reaction of "Crap! Order more drives!", but hey, I needed to grow the array anyway.

Now to track down some good SATA cables to swap. SFF-8087 mini-SAS cables also.
JorgeB Posted December 7, 2016

"- Research potential problems with the SUPERMICRO AOC-SASLP-MV8 controller (Is there a popular replacement that's faster / reliable?)"

The SASLP is one of the most used controllers in unRAID, and I'm not aware of any issues with it. It is, however, somewhat bandwidth limited: fully loaded, the max speed during a parity check or disk rebuild is 80 MB/s.
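To put the 80 MB/s figure in perspective, here is a back-of-the-envelope sketch in Python. It rests on several assumptions that are not stated in the thread: that the sizes in the device inventory (for example 7814026532 for the 8 TB parity drive) are in 1 KiB blocks, that 1 MB means 10^6 bytes, and that the whole operation runs at a constant 80 MB/s. `estimate_hours` is a hypothetical helper.

```python
def estimate_hours(size_kib: int, speed_mb_per_s: float) -> float:
    """Estimate how long a full pass over a disk takes at a constant speed.

    size_kib: disk size in 1 KiB blocks (assumed unit of the inventory sizes).
    speed_mb_per_s: sustained throughput in MB/s, with 1 MB = 10**6 bytes.
    """
    size_bytes = size_kib * 1024
    seconds = size_bytes / (speed_mb_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    parity_size_kib = 7814026532  # 8 TB parity drive from the inventory above
    print(f"{estimate_hours(parity_size_kib, 80):.1f} hours at 80 MB/s")
```

Under those assumptions a full pass over the 8 TB parity drive works out to roughly 28 hours. Real parity checks do not run at one constant speed, so treat this as an order-of-magnitude estimate only.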
DaveHavok Posted December 8, 2016 (Author)

I've got an AOC-SAS2LP-MV8 and all new SATA cabling on its way in the mail. I'll follow up after testing! Thanks everyone!
DaveHavok Posted December 11, 2016 (Author)

UPDATE:
- Replaced my AOC-SASLP-MV8 with an AOC-SAS2LP-MV8
- Replaced all cables

The issue with Drive 5 continues. The drive itself appears to be fine, but it intermittently appears and disappears from the BIOS hardware listing when rebooting.
- Moved the SATA cable for Drive 5 to a different port. No change; the drive is still detected only intermittently.
- Swapped drives out just to humor myself, and the same behavior continues with the swapped drive.

At this point, I suspect that the Icy Dock bay itself is going bad. Bypassing the Icy Dock bay and connecting directly to the drive would confirm that. It just seems so odd that the Icy Dock itself is starting to go bad. Maybe I should just replace the entire case, forgo the Icy Dock bays altogether, and get something that allows easier motherboard access for cable management.

Back at it again. Sigh.
DaveHavok Posted December 11, 2016 (Author)

Well, that sucks. Looks like the Icy Dock MB455SPF-B is no longer manufactured, and the few units for sale are at heavily marked-up prices. Looks like I'm in the market for a new case that can hold 15+ drives and has good cable management. Any suggestions or recommendations?

UPDATE: I might just swap out the defective dock for the newer version: Icy Dock FatCage MB155SP-B
itimpi Posted December 11, 2016

"Well, that sucks. Looks like the Icy Dock MB455SPF-B is no longer manufactured, and the few units for sale are at heavily marked-up prices. Looks like I'm in the market for a new case that can hold 15+ drives and has good cable management. Any suggestions or recommendations? UPDATE: I might just swap out the defective dock for the newer version: Icy Dock FatCage MB155SP-B"

I have been using the Icy Dock FatCage MB155SP-B without any issues.
DaveHavok Posted December 11, 2016 (Author)

Quick question: could dust or a slightly loose connection cause this error?

Dec 11 13:36:00 OrigamiNET kernel: ata8.00: supports DRM functions and may not be fully accessible
Dec 11 13:36:00 OrigamiNET kernel: ata8.00: configured for UDMA/33
Dec 11 13:36:00 OrigamiNET kernel: ata8: EH complete
Dec 11 13:36:33 OrigamiNET kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x880000 action 0x6 frozen
Dec 11 13:36:33 OrigamiNET kernel: ata8: SError: { 10B8B LinkSeq }
Dec 11 13:36:33 OrigamiNET kernel: ata8.00: failed command: WRITE DMA EXT
Dec 11 13:36:33 OrigamiNET kernel: ata8.00: cmd 35/00:40:a0:a0:97/00:05:a4:00:00/e0 tag 2 dma 688128 out
Dec 11 13:36:33 OrigamiNET kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 11 13:36:33 OrigamiNET kernel: ata8.00: status: { DRDY }
Dec 11 13:36:33 OrigamiNET kernel: ata8: hard resetting link
Dec 11 13:36:34 OrigamiNET kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

I'm just seeing the SError: { 10B8B LinkSeq } issue now instead of the previous { UnrecovData HostInt 10B8B BadCRC }.

UPDATE: Attaching the full syslog in case I'm missing something.

origaminet-syslog-20161211-1355.zip
John_M Posted December 12, 2016

It looks as though it could be cable related, but it also looks a little like this: http://lime-technology.com/forum/index.php?topic=40683.0

Do you have IOMMU enabled? Post your diagnostics zip.
DaveHavok Posted December 12, 2016 (Author)

Thanks for taking a look, John. I just finished the disk rebuild on the drive, and the array is back up and running in a protected state.

Next steps:
- Do a parity check with "Make Corrections to Parity Drive" unchecked.
- Fix the ACPI exception with this: http://lime-technology.com/forum/index.php?topic=45920.0

However, I noticed in the syslog that the reported errors moved from ATA8 to ATA7 about halfway through the rebuild process. The IOMMU angle is new to me, so I'll have to review the thread you linked and follow up with you.

origaminet-diagnostics-20161211-1856.zip
DaveHavok Posted December 12, 2016 (Author)

"It looks as though it could be cable related, but it also looks a little like this: http://lime-technology.com/forum/index.php?topic=40683.0 Do you have IOMMU enabled? Post your diagnostics zip."

I'm not really seeing any of the errors mentioned in that thread, or at least nothing that's a solid match for my problem. I'm also running the newest firmware for the card: 4.0.0.1812

UPDATE: I take that back, John. I do believe I am seeing what you're talking about with the IOMMU and Marvell chipset cards. So far, I'm just seeing this same error going back and forth between ATA7 and ATA8. Each port is the one being reported for hours at a time before it switches to the other.
- Finished another extended SMART check on the drive. Passed.
- Parity check completed with no errors found (array is up and protected).

However, I'm still seeing a few of these popping up every now and then:

Dec 12 06:14:01 OrigamiNET kernel: ata8.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Dec 12 06:14:01 OrigamiNET kernel: ata8.00: irq_stat 0x08000002, interface fatal error
Dec 12 06:14:01 OrigamiNET kernel: ata8: SError: { UnrecovData 10B8B BadCRC }
Dec 12 06:14:01 OrigamiNET kernel: ata8.00: failed command: SMART
Dec 12 06:14:01 OrigamiNET kernel: ata8.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 4 pio 512 in
Dec 12 06:14:01 OrigamiNET kernel: res 50/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Dec 12 06:14:01 OrigamiNET kernel: ata8.00: status: { DRDY }
Dec 12 06:14:01 OrigamiNET kernel: ata8: hard resetting link
Dec 12 06:14:02 OrigamiNET kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 12 06:14:02 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 12 06:14:02 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT3._GTF] (Node ffff88082f523bb8), AE_NOT_FOUND (20150930/psparse-542)
Dec 12 06:14:02 OrigamiNET kernel: ata8.00: supports DRM functions and may not be fully accessible
Dec 12 06:14:02 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 12 06:14:02 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT3._GTF] (Node ffff88082f523bb8), AE_NOT_FOUND (20150930/psparse-542)
Dec 12 06:14:02 OrigamiNET kernel: ata8.00: supports DRM functions and may not be fully accessible
Dec 12 06:14:02 OrigamiNET kernel: ata8.00: configured for UDMA/33
Dec 12 06:14:02 OrigamiNET kernel: ata8: EH complete

Dec 12 02:00:47 OrigamiNET kernel: ata7.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Dec 12 02:00:47 OrigamiNET kernel: ata7.00: irq_stat 0x08000000, interface fatal error
Dec 12 02:00:47 OrigamiNET kernel: ata7: SError: { UnrecovData 10B8B BadCRC }
Dec 12 02:00:47 OrigamiNET kernel: ata7.00: failed command: READ DMA EXT
Dec 12 02:00:47 OrigamiNET kernel: ata7.00: cmd 25/00:40:00:35:33/00:05:c0:00:00/e0 tag 0 dma 688128 in
Dec 12 02:00:47 OrigamiNET kernel: res 50/00:00:37:62:16/00:00:75:00:00/e0 Emask 0x10 (ATA bus error)
Dec 12 02:00:47 OrigamiNET kernel: ata7.00: status: { DRDY }
Dec 12 02:00:47 OrigamiNET kernel: ata7: hard resetting link
Dec 12 02:00:48 OrigamiNET kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 12 02:00:48 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 12 02:00:48 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT2._GTF] (Node ffff88082f523b40), AE_NOT_FOUND (20150930/psparse-542)
Dec 12 02:00:49 OrigamiNET kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Dec 12 02:00:49 OrigamiNET kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT1.SPT2._GTF] (Node ffff88082f523b40), AE_NOT_FOUND (20150930/psparse-542)
Dec 12 02:00:49 OrigamiNET kernel: ata7.00: configured for UDMA/133
Dec 12 02:00:49 OrigamiNET kernel: ata7: EH complete
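One more detail visible in the excerpts above: after each reset, ata8 comes back at 1.5 Gbps while ata7 comes back at 6.0 Gbps. A small Python sketch for pulling the most recent negotiated speed per port out of the syslog follows. The regular expression is based only on the "SATA link up" lines shown in this thread, the 3.0 Gbps threshold is an arbitrary assumption, and both function names are hypothetical. A low speed is only a hint worth checking, since some devices legitimately negotiate 1.5 Gbps.

```python
import re

# Matches kernel lines such as:
#   "... kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)"
LINK_RE = re.compile(r"kernel: (ata\d+): SATA link up ([\d.]+) Gbps")


def latest_link_speeds(lines):
    """Return {port: speed_in_gbps} using the last "link up" line per port."""
    speeds = {}
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            speeds[match.group(1)] = float(match.group(2))
    return speeds


def slow_ports(speeds, threshold_gbps=3.0):
    """List ports whose latest negotiated speed is below the threshold."""
    return sorted(port for port, gbps in speeds.items() if gbps < threshold_gbps)


if __name__ == "__main__":
    sample = [
        "Dec 12 06:14:02 OrigamiNET kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
        "Dec 12 02:00:48 OrigamiNET kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    ]
    speeds = latest_link_speeds(sample)
    print(speeds, slow_ports(speeds))
```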
John_M Posted December 12, 2016

I don't know enough about the "Marvell bug" to know whether you are actually affected by it or not. An easy way to test, though, is to disable IOMMU (a.k.a. Intel VT-d or AMD-Vi) in your BIOS, if you have it enabled. If you're not passing through hardware devices to VMs, you don't need to have it enabled. I only use a couple of simple VMs with no pass-through, so I live with it disabled. It's worth a try, at any rate.

EDIT: I downloaded your diagnostics and then got distracted before I could take a look. The distraction took all day, unfortunately! Now that I've had a chance to look, I see that you do indeed have IOMMU enabled. If you can disable it in the BIOS (you might have to tell VMs not to auto-start first) and then see if the errors in your syslog go away, you'll be able to confirm one way or the other.
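One way to confirm from the running Linux system whether the IOMMU is actually active, before and after changing the BIOS setting, is to look for populated IOMMU groups under sysfs. Here is a minimal Python sketch, assuming the standard `/sys/kernel/iommu_groups` location; `iommu_group_count` is a name made up for illustration.

```python
from pathlib import Path


def iommu_group_count(sysfs_root="/sys/kernel/iommu_groups"):
    """Return the number of IOMMU groups the kernel has created.

    A count of zero (or a missing directory) suggests the IOMMU is not
    active; a non-zero count suggests it is.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return 0
    return sum(1 for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    count = iommu_group_count()
    state = "appears active" if count else "appears inactive"
    print(f"IOMMU {state} ({count} groups found)")
```

Running it once before the BIOS change and once after gives a quick sanity check that the setting actually took effect.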
John_M Posted December 12, 2016

FWIW, here's my own experience of problems seemingly caused by a combination of a Marvell-based SAS card (the AOC-SAS2LP-MV8, same as yours) and IOMMU: http://lime-technology.com/forum/index.php?topic=38359.msg519891#msg519891

The symptoms were different to yours, but bore some resemblance to those experienced by the OP of that thread. Since disabling IOMMU it has been perfectly fine.
DaveHavok Posted December 13, 2016 (Author)

Well, I'll be damned! No errors since the change! It's been about 9 hours with nothing weird at all! I'm heading to bed to give it some more time before declaring it all good, but I'm surprised that a single BIOS adjustment was all it took to fix this!

Thanks for the second set of eyes on this, John! Much appreciated.
John_M Posted December 13, 2016

Thanks for trying that, Dave. The question now is, can you live without IOMMU or is that an inconvenience to you because you want to run more sophisticated VMs and need to pass through hardware devices? If you do need IOMMU then there's a workaround mentioned in that thread, which may or may not work. If it doesn't work then the only solution is to use a different SAS controller, which would be annoying for you since I know you only just bought your current one.

Personally, I have no need for pass-through - if I need a computer for a particular purpose I build one of the appropriate spec. I use only very simple VMs so the bug isn't a real problem for me. Now that it's stable, please use your server as you would expect to and report back if there are any issues.
DaveHavok Posted December 16, 2016 (Author)

"Thanks for trying that, Dave. The question now is, can you live without IOMMU or is that an inconvenience to you because you want to run more sophisticated VMs and need to pass through hardware devices? If you do need IOMMU then there's a workaround mentioned in that thread, which may or may not work. If it doesn't work then the only solution is to use a different SAS controller, which would be annoying for you since I know you only just bought your current one. Personally, I have no need for pass-through - if I need a computer for a particular purpose I build one of the appropriate spec. I use only very simple VMs so the bug isn't a real problem for me. Now that it's stable, please use your server as you would expect to and report back if there are any issues."

Hi John! Looks like I'm still good to go. No errors at all.

As for the IOMMU, I don't currently have a need for it, as this server is purely running the Dockers I have listed in my sig. However, if unRAID gave me the ability to set up a HyperSpin Docker instance, I could see a potential need for hardware pass-through for game system emulators, and for reading the USB ports for Bluetooth dongles and game controllers.

Thanks again!
DaveHavok Posted December 21, 2016 (Author)

And I spoke too soon... sigh.

After some extensive hardware testing, I've confirmed that the backplane is the issue here, as it has problems reading the drive from time to time. Bypassing the backplane confirms this. Unfortunately, the Icy Dock HDD cages are a pain in the ass to track down right now. There seems to be some sort of shortage, as the pricing on the MB155SP-B is very inflated.

In the meantime, is it OK to plug the drive into an external SATA enclosure so I can keep the array protected until I can find a replacement HDD cage? Basically, I'd be putting the drive back into the array using an external SATA enclosure, since the bay that the drive was in is bad.

Thanks!