dannen Posted January 11, 2017
I installed the Dynamix TRIM plugin yesterday, scheduled to run at 05:00. A parity check was running on my array at the same time. When I got home from work and looked at the syslog, I saw a lot of disk errors from this morning. Does anyone know what this might be about, and should I be concerned? I really don't know whether the errors are due to the parity check or the TRIM plugin, but they appear at about the time TRIM is scheduled to run. The ata5 drive is my cache drive, which is the only SSD in my server.
-------
Jan 11 05:02:51 mandarin kernel: ata5: hard resetting link
Jan 11 05:02:52 mandarin kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 05:02:52 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:02:52 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:02:52 mandarin kernel: ata5.00: configured for UDMA/133
Jan 11 05:02:52 mandarin kernel: ata5: EH complete
Jan 11 05:03:23 mandarin kernel: ata5.00: NCQ disabled due to excessive errors
Jan 11 05:03:23 mandarin kernel: ata5.00: exception Emask 0x0 SAct 0x7f007fff SErr 0x0 action 0x6 frozen
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/20:00:88:93:f2/00:00:01:00:00/40 tag 0 ncq 16384 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:08:a8:93:f2/00:00:01:00:00/40 tag 1 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:10:b8:93:f2/00:00:01:00:00/40 tag 2 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:18:e0:92:f2/00:00:01:00:00/40 tag 3 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:20:f0:c9:62/04:00:00:00:00/40 tag 4 ncq 524288 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/c0:28:20:fb:50/02:00:00:00:00/40 tag 5 ncq 360448 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/c0:30:20:fb:30/02:00:00:00:00/40 tag 6 ncq 360448 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/28:38:b8:92:f2/00:00:01:00:00/40 tag 7 ncq 20480 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/08:40:b0:92:f2/00:00:01:00:00/40 tag 8 ncq 4096 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:48:a0:92:f2/00:00:01:00:00/40 tag 9 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/08:50:98:92:f2/00:00:01:00:00/40 tag 10 ncq 4096 out
Jan 11 05:03:23 mandarin kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:58:98:90:f2/02:00:01:00:00/40 tag 11 ncq 262144 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: SEND FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 64/01:60:00:00:00/00:00:00:00:00/a0 tag 12 ncq 512 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:68:98:8e:f2/02:00:01:00:00/40 tag 13 ncq 262144 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/18:70:f0:92:f2/00:00:01:00:00/40 tag 14 ncq 12288 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/50:c0:08:93:f2/00:00:01:00:00/40 tag 24 ncq 40960 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:c8:08:94:f2/00:00:01:00:00/40 tag 25 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/20:d0:d8:93:f2/00:00:01:00:00/40 tag 26 ncq 16384 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:d8:c8:93:f2/00:00:01:00:00/40 tag 27 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:e0:f8:93:f2/00:00:01:00:00/40 tag 28 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/20:e8:58:93:f2/00:00:01:00:00/40 tag 29 ncq 16384 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/10:f0:78:93:f2/00:00:01:00:00/40 tag 30 ncq 8192 out
Jan 11 05:03:23 mandarin kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 11 05:03:23 mandarin kernel: ata5.00: status: { DRDY }
Jan 11 05:03:23 mandarin kernel: ata5: hard resetting link
Jan 11 05:03:23 mandarin kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 11 05:03:23 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:03:23 mandarin kernel: ata5.00: supports DRM functions and may not be fully accessible
Jan 11 05:03:23 mandarin kernel: ata5.00: configured for UDMA/133
Jan 11 05:03:23 mandarin kernel: ata5: EH complete
Jan 11 05:03:30 mandarin root: /mnt/cache: 77.6 GiB (83277766656 bytes) trimmed
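A log excerpt this repetitive is easier to read once it is summarized. The Python sketch below is only an illustration (the function names `active_tags` and `summarize` are made up for this example): it decodes the `SAct` bitmask from the exception line into the NCQ tags that were outstanding, and tallies the failed commands by type. It assumes the log has been split into one entry per line, and the three sample entries are copied from the excerpt above.

```python
import re
from collections import Counter

# Hypothetical helper names; the regexes only match the message shapes
# seen in the excerpt above.
FAILED_RE = re.compile(r"failed command: (.+)$")
TAG_RE = re.compile(r"\btag (\d+) ncq")


def active_tags(sact):
    """Return the NCQ tag numbers whose bits are set in the SAct bitmask."""
    return [bit for bit in range(32) if (sact >> bit) & 1]


def summarize(lines):
    """Count failed commands by name and collect the tags that appear."""
    commands = Counter()
    tags = []
    for line in lines:
        failed = FAILED_RE.search(line)
        if failed:
            commands[failed.group(1).strip()] += 1
        tag = TAG_RE.search(line)
        if tag:
            tags.append(int(tag.group(1)))
    return commands, tags


# Three entries copied from the excerpt (tags 11, 12 and 13).
sample = [
    "Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED",
    "Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:58:98:90:f2/02:00:01:00:00/40 tag 11 ncq 262144 out",
    "Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: SEND FPDMA QUEUED",
    "Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 64/01:60:00:00:00/00:00:00:00:00/a0 tag 12 ncq 512 out",
    "Jan 11 05:03:23 mandarin kernel: ata5.00: failed command: WRITE FPDMA QUEUED",
    "Jan 11 05:03:23 mandarin kernel: ata5.00: cmd 61/00:68:98:8e:f2/02:00:01:00:00/40 tag 13 ncq 262144 out",
]

tags_from_mask = active_tags(0x7F007FFF)
print(tags_from_mask)       # tags 0-14 and 24-30
print(len(tags_from_mask))  # 22
print(summarize(sample))
```

The decoded mask matches the excerpt: 22 commands were outstanding, on tags 0 to 14 and 24 to 30, and every one of them is reported as timed out. Only one of them (tag 12) is a SEND FPDMA QUEUED rather than a write; that command is, as far as I know, the NCQ command used to carry a queued TRIM, so treat that reading as an assumption rather than a diagnosis.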
JorgeB Posted January 11, 2017
A parity check should have no effect on your cache device. The first thing to do is to check or replace the cables.
dannen (Author) Posted January 11, 2017
Thanks, I've replaced the cable and I'm keeping my fingers crossed.
RobJ Posted January 12, 2017
My guess is that the drive locks out other I/O (perhaps only the writes?) after receiving the trim request, until it's complete. It was clearly not responding to write requests until after the trim completed. I suspect it even ignored the hard resets during the trim, which is somewhat surprising, but probably the safe thing to do. You may want to schedule the trim when nothing should be writing to the Cache drive.
JorgeB Posted January 12, 2017
That's also a possibility; I didn't think of that. 40 seconds is long for a trim, though maybe that's because of the errors.
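The "40 seconds" figure can be checked against the timestamps in the posted excerpt. The sketch below is a small illustration only: the helper name `elapsed_seconds` is invented here, and because syslog timestamps carry no year, 2017 is assumed from the post date.

```python
from datetime import datetime


def elapsed_seconds(first, last, year=2017):
    """Seconds between two syslog-style timestamps such as 'Jan 11 05:02:51'.

    Syslog omits the year, so one is supplied (an assumption taken from the
    date of the post) to give strptime a complete date.
    """
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{year} {first}", fmt)
    end = datetime.strptime(f"{year} {last}", fmt)
    return (end - start).total_seconds()


# First link reset in the excerpt vs. the "trimmed" completion message.
print(elapsed_seconds("Jan 11 05:02:51", "Jan 11 05:03:30"))  # 39.0
```

That gives 39 seconds from the first link reset in the excerpt to the line reporting the completed trim. It only measures the posted portion of the log, not when the trim actually started.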
Isabella83 Posted January 12, 2017
Quoting dannen: "Thanks, I've replaced the cable and I'm keeping my fingers crossed."
Did it work? I have a very similar problem, and I wonder whether I should replace the cable or try to find a different solution.
dannen (Author) Posted January 12, 2017
I changed the SATA cable last night. I just looked at the logs and the same error still appears during the TRIM operation. It actually lasts more than 40 seconds; what I attached yesterday was only a portion of the log. Attached is the full syslog, starting from when the TRIM operation started (the log doesn't indicate the start, only the completion) until it ended, so it's more like 2 min 20 sec. I've never seen these errors/warnings before, so they must be related to TRIM. Hopefully they can be ignored. I'll see if I can schedule my dockers to stop before TRIM and start again after, say, 5 minutes, to make sure nothing is writing to the cache during TRIM. But then again, perhaps these errors aren't "dangerous".
Attachment: log.txt
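The idea of stopping the containers around the trim could be sketched roughly as follows. This is a minimal, untested outline rather than how the TRIM plugin works: it assumes the trim is run by a custom scheduled script instead of the plugin's own schedule, that the `docker` and `fstrim` command-line tools are available, and that the cache is mounted at `/mnt/cache` as in the log above. The function names are hypothetical.

```python
import subprocess
import time


def running_containers():
    """Return the IDs of currently running containers (via `docker ps -q`)."""
    result = subprocess.run(
        ["docker", "ps", "-q"], check=True, capture_output=True, text=True
    )
    return result.stdout.split()


def build_plan(container_ids, mount_point="/mnt/cache"):
    """Build the command sequence: stop containers, trim, start them again."""
    plan = []
    if container_ids:
        plan.append(["docker", "stop", *container_ids])
    plan.append(["fstrim", "-v", mount_point])
    if container_ids:
        plan.append(["docker", "start", *container_ids])
    return plan


def run_plan(plan, settle_seconds=300, dry_run=True):
    """Run each command in order; in dry-run mode only print what would run."""
    executed = []
    for cmd in plan:
        executed.append(cmd)
        if dry_run:
            print("would run:", " ".join(cmd))
            continue
        # check=False so that a failed trim does not leave containers stopped.
        subprocess.run(cmd, check=False)
        if cmd[0] == "fstrim":
            # Extra margin after the trim before restarting, as suggested above.
            time.sleep(settle_seconds)
    return executed


if __name__ == "__main__":
    # Placeholder IDs for the dry run; use running_containers() on a real host.
    run_plan(build_plan(["abc123", "def456"]), dry_run=True)
```

Running it with `dry_run=True` only prints the three commands, which makes it easy to check the sequence before letting it touch anything. The five-minute wait is the figure from the post above, not a measured requirement.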
dannen (Author) Posted January 14, 2017
Just an update: the error didn't appear when TRIM ran this morning. I'll check the logs over the next couple of days, but hopefully it's automagically "fixed".
John_M Posted January 14, 2017
Can you tell us something about the make/model of the SSD and how it's connected to your server? Is it connected to the port that HP intends for the optional optical drive or have you put a SATA card in the PCIe expansion slot? It would be worth posting your diagnostics to see if there's anything else strange going on.
dannen (Author) Posted January 22, 2017
Problem solved: the errors disappeared after TRIM had run twice. But just for info: my SSD at that time was a 120GB Samsung 750 EVO, which I have since replaced with a 250GB Samsung 750 EVO. It's connected to the ODD SATA port. I had the same strange error the first time TRIM ran on the new disk, but it has also disappeared since. I'll upload a diagnostics file if I see the errors again, since they're gone now.