Juniper Posted June 12, 2020 (edited)

Hardware config:
ASUS P8Z68 Deluxe with 8 GB RAM
Controllers: 2-port Marvell 9128 SATA 3, 2-port Intel SATA 3, 4-port Intel SATA 2
CPU: Intel i7-2600K (not overclocked)

Drives:
Cache: 2 Samsung 860 EVO 1 TB (connected to Marvell 9128)
Parity: Seagate 8 TB (connected to Intel SATA 3 port)
Data: 4 Seagate 8 TB, 1 Seagate 3 TB
Flash: SanDisk Cruzer Glide 15.4 GB
Power: Rosewill 1000 watt power supply, 3 hard drives on each power strip

I'm combining my Win 10 drives into an Unraid NAS. I cleared out 2 disks to start, then copied the data over and added the other disks to the array. During the copying (roughly 16 TB total) I often ran into errors like these in the log:

kernel: ata19.00 exception Emask 0x0 SAct 0x800000 SErr 0x0 action 0x6 frozen
kernel: ata19.00 failed command WRITE FPDMA QUEUED
kernel: ata19.00 cmd 61/08:98:18:7b:20/00:00:0b:00:00/40 tag 19 ncq dma 4096 out
kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
kernel: ata19.00 status: { DRDY }
kernel: ata19: hard resetting link
kernel: ata19: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
kernel: ata19.00: supports DRM functions and may not be fully accessible
kernel: ata19.00: supports DRM functions and may not be fully accessible
kernel: ata19.00: configured for UDMA/133
kernel: ata19: EH complete
kernel: ata19.00: Enabling discard_zeroes_data

Several instances of this sequence showed up in the logs of both cache SSDs. None of the other drives had errors in their logs.

I searched the forum and found other people with similar error messages. The recommendation was to change cables. My test each time: copy 800+ GB to a new test share that uses the cache, let it copy for 1 hour, then start the mover. Here is what I tried:

Swapped the SATA cables for cables that have worked on other drives: test showed same errors.
Swapped the SATA cables for brand-new cables: test showed same errors.
Added a new power cable to my power supply and moved the 2 cache SSDs to it, with nothing else on it: test showed same errors.
Bought new Molex-to-SATA adapters and used them for the SSDs: test showed same errors.

The last piece it could be was the SATA controller: the cache drives were connected to the Marvell 9128 on my motherboard. When I searched the forums for Marvell I found all kinds of warnings, and yes, many folks had complained about similar errors with Marvell controllers. I checked the forum and found that the 2-port ASMedia controller I still had lying around was supported. I installed it, connected the cache drives to the Intel SATA 2 controller, moved 2 data drives to the ASMedia controller, and disabled the motherboard's Marvell controller in the BIOS: yay, the test showed no errors! I copied and moved data all day today and had no errors at all.

My question is: could the errors have been caused by the motherboard's Marvell 9128 controller? Or are the SSDs maybe bad?

Thank you so much for reading all of this! Apologies for the wall of text.

Edited June 13, 2020 by Juniper
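For anyone comparing test runs like the ones above, a small script can count how often each ATA port logged a failed command or a link reset. This is only a minimal sketch: the function name `summarize_ata_errors` and the regular expressions are illustrative assumptions based on the log lines quoted in this post, not an Unraid tool, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled on the lines quoted in the post (assumption: other
# kernels may format these messages slightly differently, e.g. with colons).
FAILED_CMD = re.compile(r"kernel: (ata\d+)(?:\.\d+)?:? failed command[: ]+(.+)")
RESET = re.compile(r"kernel: (ata\d+): hard resetting link")


def summarize_ata_errors(log_text):
    """Return (failed, resets): failed commands counted per (port, command)
    and hard link resets counted per port."""
    failed = Counter()
    resets = Counter()
    for line in log_text.splitlines():
        match = FAILED_CMD.search(line)
        if match:
            failed[(match.group(1), match.group(2).strip())] += 1
            continue
        match = RESET.search(line)
        if match:
            resets[match.group(1)] += 1
    return failed, resets


if __name__ == "__main__":
    sample = "\n".join([
        "kernel: ata19.00 failed command WRITE FPDMA QUEUED",
        "kernel: ata19: hard resetting link",
        "kernel: ata19: EH complete",
    ])
    failed, resets = summarize_ata_errors(sample)
    for (port, command), count in failed.items():
        print(f"{port}: {count} x failed {command}")
    for port, count in resets.items():
        print(f"{port}: {count} x hard reset")
```

Running it against a saved copy of the log before and after a hardware change gives a quick way to see whether the errors followed the drives or stayed with a particular port.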
JorgeB Posted June 13, 2020
Please post the diagnostics: Tools -> Diagnostics
Juniper Posted June 13, 2020 (edited)
Thank you very much for responding. Here are the diagnostics. The system is running just the array, with no VMs or anything else, only the community plugin in case I want to play with Docker sometime later.
schiethucken-diagnostics-20200613-1158.zip
Edited June 13, 2020 by Juniper
Juniper Posted June 13, 2020
I checked and saw that my return window closes today, so I brought the SSDs back to Best Buy. They might be fine, but after 12 days of error messages it felt better to play it safe. I will buy new ones. Thank you to everybody who read my post, and a big special thank you to johnnie.black for responding and for being willing to wade through my diagnostics :)
JorgeB Posted June 14, 2020
The diags were taken after rebooting, so there's not much to see. If you need more help in the future, remember that we need them from before rebooting.
Juniper Posted June 14, 2020 (edited)
Very sorry about this: each time I made a hardware change, the system overwrote its logs. Noob here: I didn't know that I could generate diagnostics with all the system info. I'm still learning about Unraid. Next time I'll generate diagnostics before each hardware change and before shutting the system down after errors, and save them somewhere. Thank you very much for taking the time to help me!
Edited June 14, 2020 by Juniper
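As a rough illustration of the "save them somewhere before shutting down" habit described above, the sketch below copies a log file to a timestamped file in another folder. The default paths (`/var/log/syslog` as the live log, `/boot/logs` as a folder on the flash drive) and the function name `snapshot_log` are assumptions for illustration and should be checked on the actual system. This does not replace Tools -> Diagnostics, which is what helpers on the forum ask for.

```python
import shutil
import time
from pathlib import Path


def snapshot_log(src="/var/log/syslog", dest_dir="/boot/logs"):
    """Copy the log at `src` into `dest_dir` under a timestamped name.

    Both default paths are assumptions; pass your own if they differ.
    Returns the path of the copy.
    """
    src_path = Path(src)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest / f"{src_path.name}-{stamp}.txt"
    shutil.copy2(src_path, target)  # copy contents and file timestamps
    return target


if __name__ == "__main__":
    print(f"Saved log copy to {snapshot_log()}")
```

Calling `snapshot_log()` right after an error shows up, and again before each hardware change, leaves a copy that survives the reboot.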