January 11, 2017 I installed the Dynamix SSD TRIM plugin yesterday and scheduled it to run at 05:00. A parity check was running on my array at the same time. When I looked at the syslog after getting home from work, I saw a lot of disk errors from this morning. Does anyone know what this might be about, and should I be concerned? I don't know whether the errors are due to the parity check or the TRIM plugin, but they appear at about the time TRIM is scheduled to run. The ata5 drive is my cache drive, which is the only SSD in my server.
-------
Jan 11 05:02:51 mandarin kernel: ata5: hard resetting link
Jan 11 05:02:52 mandarin kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 05:02:52 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:02:52 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:02:52 mandarin kernel: ata5.00: configured for UDMA/133
Jan 11 05:02:52 mandarin kernel: ata5: EH complete
Jan 11 05:03:23 mandarin kernel: ata5.00: NCQ disabled due to excessive errors
Jan 11 05:03:23 mandarin kernel: ata5.00: exception Emask 0x0 SAct 0x7f007fff SErr 0x0 action 0x6 frozen
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/20:00:88:93:f2/00:00:01:00:00/40 tag 0 ncq 16384 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:08:a8:93:f2/00:00:01:00:00/40 tag 1 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:10:b8:93:f2/00:00:01:00:00/40 tag 2 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:18:e0:92:f2/00:00:01:00:00/40 tag 3 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:20:f0:c9:62/04:00:00:00:00/40 tag 4 ncq 524288 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/c0:28:20:fb:50/02:00:00:00:00/40 tag 5 ncq 360448 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/c0:30:20:fb:30/02:00:00:00:00/40 tag 6 ncq 360448 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/28:38:b8:92:f2/00:00:01:00:00/40 tag 7 ncq 20480 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/08:40:b0:92:f2/00:00:01:00:00/40 tag 8 ncq 4096 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:48:a0:92:f2/00:00:01:00:00/40 tag 9 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/08:50:98:92:f2/00:00:01:00:00/40 tag 10 ncq 4096 out
Jan 11 05:03:23 mandarin kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:58:98:90:f2/02:00:01:00:00/40 tag 11 ncq 262144 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: SEND FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 64/01:60:00:00:00/00:00:00:00:00/a0 tag 12 ncq 512 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:68:98:8e:f2/02:00:01:00:00/40 tag 13 ncq 262144 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/18:70:f0:92:f2/00:00:01:00:00/40 tag 14 ncq 12288 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/50:c0:08:93:f2/00:00:01:00:00/40 tag 24 ncq 40960 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:c8:08:94:f2/00:00:01:00:00/40 tag 25 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/20:d0:d8:93:f2/00:00:01:00:00/40 tag 26 ncq 16384 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:d8:c8:93:f2/00:00:01:00:00/40 tag 27 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:e0:f8:93:f2/00:00:01:00:00/40 tag 28 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/20:e8:58:93:f2/00:00:01:00:00/40 tag 29 ncq 16384 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:f0:78:93:f2/00:00:01:00:00/40 tag 30 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5: hard resetting link
Jan 11 05:03:23 mandarin kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 05:03:23 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:03:23 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:03:23 mandarin kernel: ata5.00: configured for UDMA/133
Jan 11 05:03:23 mandarin kernel: ata5: EH complete
Jan 11 05:03:30 mandarin root: /mnt/cache: 77.6 GiB (83277766656 bytes) trimmed
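For anyone reading a log like the one above, here is a rough Python sketch for summarising it. It decodes the `SAct` bitmask from the exception line into the NCQ tags that were in flight and counts the failed commands by name. The function names and the three-line sample are only illustrative; the sample lines are copied from the log above.

```python
import re
from collections import Counter


def sact_tags(sact):
    """Return the NCQ tag numbers whose bits are set in an SAct mask."""
    return [bit for bit in range(32) if (sact >> bit) & 1]


def failed_commands(log_text):
    """Count 'failed command: <NAME>' entries by command name."""
    # Command names are upper-case words such as 'WRITE FPDMA QUEUED'.
    # The \b stops the match before a following 'Jan ...' timestamp if the
    # log has been flattened onto one line.
    names = re.findall(r"failed command: ([A-Z]+(?: [A-Z]+\b)*)", log_text)
    return Counter(names)


# Three lines copied from the syslog excerpt in this thread.
SAMPLE = """\
Jan 11 05:03:23 mandarin kernel: ata5.00: exception Emask 0x0 SAct 0x7f007fff SErr 0x0 action 0x6 frozen
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: SEND FPDMA QUEUED
"""

mask = int(re.search(r"SAct (0x[0-9a-f]+)", SAMPLE).group(1), 16)
print("tags in flight:", sact_tags(mask))
print("failed commands:", dict(failed_commands(SAMPLE)))
```

For `0x7f007fff` this gives tags 0 to 14 and 24 to 30, which are the same 22 tags listed in the `cmd` lines of the log, so every queued command timed out together rather than one particular write failing.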
January 11, 2017 Community Expert A parity check should have no effect on your cache device. The first thing to do is to check or replace the cables.
January 12, 2017 My guess is that the drive locks out other I/O (perhaps only the writes?) after receiving the trim request, until it's complete. It was clearly not responding to write requests until after the trim completed. I suspect it even ignored the hard resets during the trim, which is somewhat surprising, but probably the safe thing to do. You may want to schedule the trim when nothing should be writing to the cache drive.
January 12, 2017 Community Expert That's also a possibility; I didn't think of that. 40 seconds is long for a trim, though that may be because of the errors.
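The roughly 40-second figure can be checked from the syslog timestamps. Below is a small Python sketch that measures the time from the first `hard resetting link` line to the `trimmed` line. The function name is made up for illustration, and since syslog timestamps carry no year, the `year` argument is an assumption taken from the post dates.

```python
import re
from datetime import datetime

# Leading syslog timestamp, e.g. 'Jan 11 05:02:51'.
STAMP = re.compile(r"([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) ")


def stall_seconds(log_text, year=2017):
    """Seconds from the first 'hard resetting link' line to the 'trimmed' line.

    Returns None if either marker is missing. The year is an assumption,
    because syslog timestamps do not include one.
    """
    start = end = None
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if not match:
            continue
        when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if start is None and "hard resetting link" in line:
            start = when
        if line.rstrip().endswith("trimmed"):
            end = when
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


# Three lines copied from the excerpt in the first post.
EXCERPT = """\
Jan 11 05:02:51 mandarin kernel: ata5: hard resetting link
Jan 11 05:03:23 mandarin kernel: ata5: hard resetting link
Jan 11 05:03:30 mandarin root: /mnt/cache: 77.6 GiB (83277766656 bytes) trimmed
"""

print(stall_seconds(EXCERPT))
```

On the posted excerpt this works out to 39 seconds (05:02:51 to 05:03:30). It only measures what is in the text you feed it, so a fuller log could give a longer window.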
January 12, 2017 "Thanks, replaced the cable and hold my thumbs" Did it work? I have a very similar problem and I wonder whether I should replace the cable or try to find a different solution.
January 12, 2017 Author I changed the SATA cable last night. I just looked at the logs and the same error still appears during the TRIM operation. It actually lasts more than 40 seconds; what I attached yesterday was only a portion of the log. Attached is the full syslog, starting from when the TRIM operation began (the log doesn't seem to record the start, only the completion) until it ended, so it's more like 2 min 20 s. I've never seen these errors/warnings before, so they must be related to TRIM; hopefully they can be ignored. I'll see if I can schedule my dockers to stop before TRIM and start again after, say, 5 minutes, to make sure nothing is writing to the cache during TRIM. But then again, perhaps these errors aren't "dangerous". log.txt
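The idea of stopping the dockers around the TRIM run could be sketched roughly like this in Python. It assumes the trim is launched from the script itself with `fstrim -v` (whose output looks like the "trimmed" line in the syslog) rather than by the plugin's own schedule, and that the `docker` CLI is on the PATH. The function name and the `run` parameter are hypothetical, added so the sequence can be dry-run without touching real containers.

```python
import subprocess


def trim_with_containers_stopped(mount="/mnt/cache", run=subprocess.run):
    """Stop running containers, trim the mount, then start them again.

    `run` defaults to subprocess.run; pass a stand-in to dry-run the sequence.
    Returns the list of container IDs that were stopped and restarted.
    """
    # 'docker ps -q' prints the IDs of currently running containers.
    ps = run(["docker", "ps", "-q"], capture_output=True, text=True, check=True)
    ids = ps.stdout.split()
    if ids:
        run(["docker", "stop", *ids], check=True)
    try:
        # fstrim blocks until the trim has finished, so no fixed wait is needed.
        run(["fstrim", "-v", mount], check=True)
    finally:
        # Restart the containers even if the trim itself failed.
        if ids:
            run(["docker", "start", *ids], check=True)
    return ids
```

Because the script waits for `fstrim` to return, the containers come back as soon as the trim is done instead of after a fixed 5-minute delay. Anything else that writes to the cache (the mover, VMs) would still need separate handling.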
January 14, 2017 Author Just an update: the error didn't appear when TRIM ran this morning. I'll check the logs over the next couple of days, but hopefully it's automagically "fixed".
January 14, 2017 Can you tell us something about the make/model of the SSD and how it's connected to your server? Is it connected to the port that HP intends for the optional optical drive or have you put a SATA card in the PCIe expansion slot? It would be worth posting your diagnostics to see if there's anything else strange going on.
January 22, 2017 Author Problem solved: the errors disappeared after TRIM had run twice. Just for info: my SSD at that time was a 120GB Samsung 750 EVO, but I have since replaced it with a 250GB Samsung 750 EVO. It's connected to the ODD SATA port. I had the same strange error the first time TRIM ran on the new disk, but it has also disappeared since. I'll upload a diagnostics file if I see the errors again, since they're gone now.