[SOLVED] failed command: READ FPDMA QUEUED - bad new SSD?



I purchased a 1TB Samsung SSD 860 EVO before Christmas and just kind of left it on the shelf till today.

 

Well, I installed it in my server to replace an old but reliable 500GB 850 EVO, and that's where the trouble started.

 

As soon as I started copying data to the drive, I got warnings about CRC errors, the SMART count for them keeps going up, and variations of this continue to show up in the logs:

Feb  1 11:22:43 VOID kernel: ata6.00: irq_stat 0x08000000, interface fatal error
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/e8:40:f8:ce:f0/09:00:02:00:00/40 tag 8 ncq dma 1298432 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/50:48:e0:d8:f0/09:00:02:00:00/40 tag 9 ncq dma 1220608 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/d8:70:30:e2:f0/09:00:02:00:00/40 tag 14 ncq dma 1290240 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/90:78:08:ec:f0/09:00:02:00:00/40 tag 15 ncq dma 1253376 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/f8:80:98:f5:f0/09:00:02:00:00/40 tag 16 ncq dma 1306624 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/e0:a8:90:ff:f0/09:00:02:00:00/40 tag 21 ncq dma 1294336 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
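For anyone comparing runs across ports and cables, a rough way to see whether the errors follow the drive is to count the failed commands per ata port and the reason libata gives for them. This is only a sketch: the regexes are my assumption based on the line shapes pasted in this thread, and `tally` and `sample` are made-up names.

```python
import re
from collections import Counter

# Matches e.g. "... kernel: ata6.00: failed command: WRITE FPDMA QUEUED"
FAILED_RE = re.compile(r"kernel: (ata\d+)\.\d+: failed command: (.+?)\s*$")
# Matches the result line, e.g. "res 40/00:... Emask 0x10 (ATA bus error)"
EMASK_RE = re.compile(r"res .*Emask (0x[0-9a-f]+) \(([^)]+)\)")

def tally(lines):
    """Count failed commands per (port, command) and error reasons."""
    failed = Counter()
    reasons = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            failed[(m.group(1), m.group(2))] += 1
            continue
        m = EMASK_RE.search(line)
        if m:
            reasons[m.group(2)] += 1
    return failed, reasons

# Four lines copied from the logs in this thread.
sample = [
    "Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED",
    "Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)",
    "Feb 1 18:13:05 VOID kernel: ata1.00: failed command: READ FPDMA QUEUED",
    "Feb 1 18:13:05 VOID kernel: res 41/84:20:28:1a:25/00:00:0e:00:00/00 Emask 0x410 (ATA bus error) <F>",
]

failed, reasons = tally(sample)
for (port, command), count in sorted(failed.items()):
    print(port, command, count)
for reason, count in reasons.items():
    print(reason, count)
```

To run it against a whole syslog, pass the open file instead of `sample`, e.g. `tally(open("syslog.txt"))` (the file name is a placeholder).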

 

So I shut it down and swapped out:

 

3 different SATA cables, 2 Molex-to-SATA power adapters, and the SATA port, which I swapped for another known-working port on the motherboard.

 

When I changed the port on the motherboard the error changed somewhat, and it appeared when I first booted, before copying any data:

Feb 1 18:13:05 VOID kernel: ata1.00: READ LOG DMA EXT failed, trying PIO
Feb 1 18:13:05 VOID kernel: ata1.00: exception Emask 0x0 SAct 0xffffffff SErr 0x0 action 0x6
Feb 1 18:13:05 VOID kernel: ata1.00: irq_stat 0x40000008
Feb 1 18:13:05 VOID kernel: ata1.00: failed command: READ FPDMA QUEUED
Feb 1 18:13:05 VOID kernel: ata1.00: cmd 60/20:20:28:1a:25/00:00:0e:00:00/40 tag 4 ncq dma 16384 in
Feb 1 18:13:05 VOID kernel: res 41/84:20:28:1a:25/00:00:0e:00:00/00 Emask 0x410 (ATA bus error) <F>
Feb 1 18:13:05 VOID kernel: ata1.00: status: { DRDY ERR }
Feb 1 18:13:05 VOID kernel: ata1.00: error: { ICRC ABRT }
Feb 1 18:13:05 VOID kernel: ata1: hard resetting link
Feb 1 18:13:05 VOID kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 1 18:13:05 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 18:13:05 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 18:13:05 VOID kernel: ata1.00: configured for UDMA/133
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 Sense Key : 0xb [current]
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 ASC=0x47 ASCQ=0x0
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 CDB: opcode=0x28 28 00 0e 25 1a 28 00 00 20 00
Feb 1 18:13:05 VOID kernel: blk_update_request: I/O error, dev sdb, sector 237312552 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
Feb 1 18:13:05 VOID kernel: ata1: EH complete
Feb 1 18:13:05 VOID kernel: ata1.00: Enabling discard_zeroes_data
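The `cmd` line in that log can be tied back to the `sector 237312552` the block layer complains about. The sketch below assumes the twelve hex fields are printed in the order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and that for NCQ commands the sector count sits in the feature fields and the tag in the upper bits of nsect. I have only checked that layout against the numbers in this thread, and `decode_ncq_cmd` is a made-up helper name.

```python
import re

# command / five current-register bytes / five "hob" (high order) bytes / device
CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/([0-9a-f]{2})"
)

def decode_ncq_cmd(line, sector_size=512):
    """Decode LBA, length and tag from a libata 'cmd' line (assumed layout)."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no libata 'cmd' field found")
    command = int(m.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(2).split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":")
    )
    # 48-bit LBA: the three hob bytes are the high half, the current bytes the low half.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)
    sectors = hob_feature << 8 | feature  # NCQ keeps the count in "feature"
    return {
        "command": command,
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * sector_size,  # assumes 512-byte logical sectors
        "tag": nsect >> 3,               # NCQ keeps the tag in bits 7:3 of nsect
    }

line = ("Feb 1 18:13:05 VOID kernel: ata1.00: cmd 60/20:20:28:1a:25/00:00:0e:00:00/40 "
        "tag 4 ncq dma 16384 in")
info = decode_ncq_cmd(line)
print(info["lba"], info["bytes"], info["tag"])  # 237312552 16384 4
```

The decoded LBA, byte count and tag agree with `sector 237312552`, `ncq dma 16384` and `tag 4` in the log, which is why I trust the field order for these lines.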

 

It hasn't reappeared yet; I'm copying more data as a test to see if it happens again. But so far the issue continues to follow the drive, so I think I might have gotten a defective one. Can anyone confirm? Are there other things I could try?

 

On a related note, does anyone have any experience with getting this type of issue handled under warranty with Samsung? Like an idiot, I have missed my return window with Amazon.

 

void-diagnostics-20210201-1818.zip

syslog-20210201-114006.txt

syslog-20210201-161732.txt

syslog-20210201-175504.txt

 

 

EDIT: Still doing it, same errors as before:

Feb 1 19:08:20 VOID kernel: ata1.00: exception Emask 0x10 SAct 0x80800000 SErr 0x0 action 0x6 frozen
Feb 1 19:08:20 VOID kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Feb 1 19:08:20 VOID kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Feb 1 19:08:20 VOID kernel: ata1.00: cmd 61/60:b8:00:7e:db/00:00:05:00:00/40 tag 23 ncq dma 49152 out
Feb 1 19:08:20 VOID kernel: res 40/00:b8:00:7e:db/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
Feb 1 19:08:20 VOID kernel: ata1.00: status: { DRDY }
Feb 1 19:08:20 VOID kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Feb 1 19:08:20 VOID kernel: ata1.00: cmd 61/60:f8:80:7e:db/00:00:05:00:00/40 tag 31 ncq dma 49152 out
Feb 1 19:08:20 VOID kernel: res 40/00:b8:00:7e:db/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
Feb 1 19:08:20 VOID kernel: ata1.00: status: { DRDY }
Feb 1 19:08:20 VOID kernel: ata1: hard resetting link
Feb 1 19:08:21 VOID kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 1 19:08:21 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 19:08:21 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 19:08:21 VOID kernel: ata1.00: configured for UDMA/133
Feb 1 19:08:21 VOID kernel: ata1: EH complete
Feb 1 19:08:21 VOID kernel: ata1.00: Enabling discard_zeroes_data
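A small aside on reading the `exception` line: `SAct` looks like a bitmask of the NCQ tags that were outstanding when the error hit, one bit per tag. That reading is my assumption, but it lines up with the tags of the failed commands listed right after it. `sact_tags` is just a helper name for the sketch.

```python
def sact_tags(sact):
    """Return the NCQ tag numbers whose bits are set in an SAct value."""
    return [bit for bit in range(32) if (sact >> bit) & 1]

print(sact_tags(0x80800000))  # [23, 31]
```

Tags 23 and 31 are exactly the two WRITE FPDMA QUEUED commands reported as failed above, so both outstanding commands were thrown away when the link was reset.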

 

Edited by weirdcrap
unsolved, errors are back
12 hours ago, JorgeB said:

Samsung SSDs and older AMD chipsets don't usually go well, try connecting it to the LSI, you'll lose trim support but still worth trying.

Really? That's interesting, the old one worked great for the last 5 years, not a single error.

 

After the most recent two errors all has been quiet and I have a large data transfer going so I'll have to try this in a couple days.

 

EDIT: and just like that I jinxed myself, tons more errors now. Fantastic.

 

EDIT2: OK, I shuffled stuff around and put the SSD on the ASMedia controller I have the parity drive on, and so far so good on boot up. I'll have to start copying data and see if the errors still show up.

 

EDIT3: Yeah moving it off the AMD controller solved it. Per usual JorgeB has the answers.

Edited by weirdcrap
  • weirdcrap changed the title to [SOLVED] failed command: READ FPDMA QUEUED - bad new SSD?

@JorgeB  It waited 2 days to resurface but the errors are back, even on a different controller with different sata and power cables...

 

 

Feb  5 00:45:10 VOID crond[1837]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Feb  5 01:00:58 VOID kernel: ata7.00: exception Emask 0x0 SAct 0xc0c00041 SErr 0x0 action 0x6 frozen
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:00:b8:ca:00/00:00:00:00:00/40 tag 0 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: READ FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 60/08:30:00:20:cd/00:00:0c:00:00/40 tag 6 ncq dma 4096 in
Feb  5 01:00:58 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:b0:28:a4:63/00:00:00:00:00/40 tag 22 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:b8:e8:72:65/00:00:00:00:00/40 tag 23 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 64/01:f0:00:00:00/00:00:00:00:00/a0 tag 30 ncq dma 512 out
Feb  5 01:00:58 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:f8:b0:be:00/00:00:00:00:00/40 tag 31 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7: hard resetting link
Feb  5 01:00:58 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb  5 01:00:58 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:00:58 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:00:58 VOID kernel: ata7.00: configured for UDMA/133
Feb  5 01:00:58 VOID kernel: ata7: EH complete
Feb  5 01:00:58 VOID kernel: ata7.00: Enabling discard_zeroes_data
Feb  5 01:00:58 VOID kernel: ata7.00: invalid checksum 0xdc on log page 10h
Feb  5 01:00:58 VOID kernel: ata7: log page 10h reported inactive tag 1
Feb  5 01:00:58 VOID kernel: ata7.00: exception Emask 0x1 SAct 0x1f8 SErr 0x0 action 0x0
Feb  5 01:00:58 VOID kernel: ata7.00: irq_stat 0x40000008
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:18:b0:be:00/00:00:00:00:00/40 tag 3 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 64/01:20:00:00:00/00:00:00:00:00/a0 tag 4 ncq dma 512 out
Feb  5 01:00:58 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:28:e8:72:65/00:00:00:00:00/40 tag 5 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:30:28:a4:63/00:00:00:00:00/40 tag 6 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: READ FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 60/08:38:00:20:cd/00:00:0c:00:00/40 tag 7 ncq dma 4096 in
Feb  5 01:00:58 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:00:58 VOID kernel: ata7.00: cmd 61/08:40:b8:ca:00/00:00:00:00:00/40 tag 8 ncq dma 4096 out
Feb  5 01:00:58 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
Feb  5 01:00:58 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:00:58 VOID kernel: ata7.00: failed to IDENTIFY (I/O error, err_mask=0x100)
Feb  5 01:00:58 VOID kernel: ata7.00: revalidation failed (errno=-5)
Feb  5 01:00:58 VOID kernel: ata7: hard resetting link
Feb  5 01:00:59 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb  5 01:00:59 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:00:59 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:00:59 VOID kernel: ata7.00: configured for UDMA/133
Feb  5 01:00:59 VOID kernel: ata7.00: device reported invalid CHS sector 0
Feb  5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=57s
Feb  5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 Sense Key : 0x5 [current] 
Feb  5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 ASC=0x21 ASCQ=0x4 
Feb  5 01:00:59 VOID kernel: sd 8:0:0:0: [sdh] tag#4 CDB: opcode=0x93 93 08 00 00 00 00 00 00 10 00 00 00 00 20 00 00
Feb  5 01:00:59 VOID kernel: blk_update_request: I/O error, dev sdh, sector 4096 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
Feb  5 01:00:59 VOID kernel: ata7: EH complete
Feb  5 01:00:59 VOID kernel: ata7.00: Enabling discard_zeroes_data
Feb  5 01:01:59 VOID kernel: ata7.00: exception Emask 0x0 SAct 0x10ff018 SErr 0x0 action 0x6 frozen
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/10:18:d0:1f:cd/00:00:0c:00:00/40 tag 3 ncq dma 8192 out
Feb  5 01:01:59 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/08:20:e0:1f:cd/00:00:0c:00:00/40 tag 4 ncq dma 4096 out
Feb  5 01:01:59 VOID kernel:         res 40/00:20:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/20:60:c0:a4:c5/00:00:05:00:00/40 tag 12 ncq dma 16384 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 64/01:68:00:00:00/00:00:00:00:00/a0 tag 13 ncq dma 512 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/20:70:40:a5:c5/00:00:05:00:00/40 tag 14 ncq dma 16384 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/20:78:a0:a5:c5/00:00:05:00:00/40 tag 15 ncq dma 16384 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/80:80:c0:35:2b/02:00:00:00:00/40 tag 16 ncq dma 327680 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/80:88:c0:35:33/02:00:00:00:00/40 tag 17 ncq dma 327680 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/08:90:58:75:65/00:00:00:00:00/40 tag 18 ncq dma 4096 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/08:98:90:77:65/00:00:00:00:00/40 tag 19 ncq dma 4096 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:01:59 VOID kernel: ata7.00: cmd 61/20:c0:20:a4:c5/00:00:05:00:00/40 tag 24 ncq dma 16384 out
Feb  5 01:01:59 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:01:59 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:01:59 VOID kernel: ata7: hard resetting link
Feb  5 01:02:00 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb  5 01:02:00 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:02:00 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:02:00 VOID kernel: ata7.00: configured for UDMA/133
Feb  5 01:02:00 VOID kernel: ata7.00: device reported invalid CHS sector 0
Feb  5 01:02:00 VOID kernel: ata7: EH complete
Feb  5 01:02:00 VOID kernel: ata7.00: Enabling discard_zeroes_data
Feb  5 01:02:30 VOID kernel: ata7.00: NCQ disabled due to excessive errors
Feb  5 01:02:30 VOID kernel: ata7.00: exception Emask 0x0 SAct 0x60000007 SErr 0x0 action 0x6 frozen
Feb  5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:02:30 VOID kernel: ata7.00: cmd 61/20:00:c0:a4:c5/00:00:05:00:00/40 tag 0 ncq dma 16384 out
Feb  5 01:02:30 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:02:30 VOID kernel: ata7.00: cmd 61/08:08:e0:1f:cd/00:00:0c:00:00/40 tag 1 ncq dma 4096 out
Feb  5 01:02:30 VOID kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:02:30 VOID kernel: ata7.00: cmd 61/10:10:d0:1f:cd/00:00:0c:00:00/40 tag 2 ncq dma 8192 out
Feb  5 01:02:30 VOID kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:02:30 VOID kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Feb  5 01:02:30 VOID kernel: ata7.00: cmd 61/20:e8:40:a5:c5/00:00:05:00:00/40 tag 29 ncq dma 16384 out
Feb  5 01:02:30 VOID kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:02:30 VOID kernel: ata7.00: failed command: SEND FPDMA QUEUED
Feb  5 01:02:30 VOID kernel: ata7.00: cmd 64/01:f0:00:00:00/00:00:00:00:00/a0 tag 30 ncq dma 512 out
Feb  5 01:02:30 VOID kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  5 01:02:30 VOID kernel: ata7.00: status: { DRDY }
Feb  5 01:02:30 VOID kernel: ata7: hard resetting link
Feb  5 01:02:30 VOID kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb  5 01:02:30 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:02:30 VOID kernel: ata7.00: supports DRM functions and may not be fully accessible
Feb  5 01:02:30 VOID kernel: ata7.00: configured for UDMA/133
Feb  5 01:02:30 VOID kernel: ata7: EH complete
Feb  5 01:02:30 VOID kernel: ata7.00: Enabling discard_zeroes_data
Feb  5 01:02:43 VOID kernel: BTRFS warning (device sdh1): failed to trim 1 block group(s), last error -5

 

void-diagnostics-20210205-0458.zip

 

Maybe a firmware update will help it get its sh*t together. Putting the timeout error into Google brings up lots of these:

https://bbs.archlinux.org/viewtopic.php?id=168530

https://askubuntu.com/questions/1154493/why-is-my-ssd-periodically-hanging-for-20-to-30-seconds

Edited by weirdcrap
  • weirdcrap changed the title to failed command: READ FPDMA QUEUED - bad new SSD?
5 hours ago, JorgeB said:

Worth trying if available.

Well crap, apparently there is no newer firmware available...

 

EDIT: https://bugzilla.kernel.org/show_bug.cgi?id=203475 This seems to be related to both TRIM and NCQ. I only TRIM on Sundays and these errors showed up on a different day, so this is almost certainly the NCQ side of it. Disabling NCQ tanks 4k random performance though, which isn't ideal.

 

If I had known the 860 EVOs were going to be such trouble I would never have bought them...

 

I guess I should start buying a different brand of SSD.

 

How would I go about trying to disable NCQ in UnRAID for this disk as a test?
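For reference, one commonly used way to test this on Linux without rebooting is to drop the device's queue depth to 1 through sysfs, which effectively turns NCQ off for that disk. Below is a minimal sketch of that, assuming the disk shows up as `sdh` as in the log above; the function names are made up, it needs root, and the setting does not survive a reboot. The `libata.force=noncq` kernel parameter is the boot-time alternative.

```python
from pathlib import Path

def queue_depth_path(device, sysfs_root="/sys"):
    """Path of the queue_depth attribute for a block device such as 'sdh'."""
    return Path(sysfs_root) / "block" / device / "device" / "queue_depth"

def set_queue_depth(device, depth=1, sysfs_root="/sys"):
    """Write a new queue depth and return the previous one.

    A depth of 1 means only one command is outstanding at a time,
    so the drive never sees queued (NCQ) commands.
    """
    path = queue_depth_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text(f"{depth}\n")
    return previous

if __name__ == "__main__":
    # Only report the path here; call set_queue_depth("sdh") as root to apply.
    print(queue_depth_path("sdh"))
```

Writing the returned previous value back with `set_queue_depth("sdh", previous)` should restore the original behaviour, so the test is easy to undo.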

 

EDIT: alternatively, I have not yet tried this drive on the LSI. Loss of TRIM would obviously bypass that issue but what about NCQ? Does the LSI support it?

 

Edited by weirdcrap
On 2/5/2021 at 11:22 AM, JorgeB said:

It's not a general issue, I have several 860 EVO without issues, likely depends on the controller, but for you possibly better to try a different brand.

Well, ideally I'd like to make this one work in this system as I'm stuck with it. I have moved it to the LSI controller to see if that helps. If not, I may just keep this for an eventual gaming build (whenever the scalping and price craziness stops) and buy a Crucial MX500. I was reading their marketing and they have TRIM at the firmware level, which sounds interesting for scenarios where it may be on an LSI controller.

 

EDIT: Ok so far so good on the LSI.

 

EDIT: Well TRIM fails and throws an error but other than that things appear to be working.

Edited by weirdcrap
  • weirdcrap changed the title to [SOLVED] failed command: READ FPDMA QUEUED - bad new SSD?

@JorgeB So what are good 1TB SATA (non-M.2) SSDs that are known to work well with UnRAID without frustratingly weird firmware issues like my 860 EVO issue or, apparently, Crucial MX500s randomly reporting bogus bad sectors?

 

I just bought an MX500 to put into my remote server as a replacement for my other aging 850 EVO. But if it's going to randomly report bad sectors I don't really want anything to do with that and I would rather return it for something else. 

 

Intel seems to have all but abandoned consumer line SSDs bigger than 512GB. I don't particularly want to pay a premium for a "data center quality" SATA SSD: https://www.intel.com/content/www/us/en/products/memory-storage/solid-state-drives/data-center-ssds.html

 

I've not had good experience with Kingstons and I don't really know much about the Western Digital line of SSDs.

 

EDIT: So I found the main topic for these Crucial drives: 

So disabling the monitoring for 197 just prevents email/push alerts but it will still track and report bad sectors that stick around in the webui?

 

UnRAID only disables a disk if it fails a write test, so this bug shouldn't cause any sort of disabling issues, right?

 

 

Edited by weirdcrap
10 minutes ago, weirdcrap said:

So disabling the monitoring for 197 just prevents email/push alerts but it will still track and report bad sectors that stick around in the webui?

Yep, it's what I do.

 

11 minutes ago, weirdcrap said:

UnRAID only disables a disk if it fails a write test, so this bug shouldn't cause any sort of disabling issues, right?

Correct.

21 minutes ago, JorgeB said:

Yep, it's what I do.

 

Correct.

Well, I guess I'll stick with the MX500 then, even though just ignoring the issue gives me a deep-seated feeling of wrongness lol.

 

Maybe the SmartMonTools update you linked to will resolve this once and for all by adjusting the alerting behavior for this drive.

 

A question about the 860 EVO drive on my LSI and its lack of TRIM: if I never fill my SSD up (it generally only hovers around 100-200GB used), will the lack of TRIM support have any noticeable effect on performance and/or longevity?

 

I've been reading about the subject and people talk about the write amplification implications of not having TRIM or garbage collection for heavily utilized SSDs. However it sounds like if you have lots of blocks without needed data there shouldn't need to be a lot of shuffling around done by the controller firmware, right?

Edited by weirdcrap
  • 3 years later...
On 2/2/2021 at 9:50 AM, JorgeB said:

Samsung SSDs and older AMD chipsets don't usually go well, try connecting it to the LSI, you'll lose trim support but still worth trying.

 

You could try the following kernel parameter:

libata.force=3.0Gbps

 

It limits your SATA ports to 3Gbps instead of the default 6Gbps and is known to fix lots of random issues with SSDs on AMD systems. I think some AMD motherboards have slightly buggy SATA hardware that may fail if the kernel issues commands too quickly, but when you limit the ports to 3Gbps mode they are stable again.

 

Of course, this also cuts your max bandwidth per port from 600 MB/s to 300 MB/s. In most real world situations the difference isn't that great for SATA SSD devices, though.

 

Unfortunately, I don't know how to limit this workaround to the motherboard SATA ports only, if you have an additional SATA controller in your system.
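On that last point, the kernel parameter documentation describes `libata.force` as a comma-separated list of `[ID:]VAL` entries, where the ID is the port number libata prints in the log (the N in `ataN`). So it may be possible to restrict the speed limit to specific ports. A minimal sketch for building such a string follows; `libata_force` is a made-up helper, and ports 1 and 6 are only examples taken from the onboard ports seen earlier in this thread. Port numbers can change when controllers or cables are moved, so check the current log first.

```python
def libata_force(value, ports=()):
    """Build a libata.force kernel parameter.

    With no ports the value applies to every port; otherwise each
    entry is prefixed with the libata port number (the N in "ataN").
    """
    if not ports:
        return f"libata.force={value}"
    return "libata.force=" + ",".join(f"{port}:{value}" for port in ports)

print(libata_force("3.0Gbps"))          # libata.force=3.0Gbps
print(libata_force("3.0Gbps", [1, 6]))  # libata.force=1:3.0Gbps,6:3.0Gbps
```

The resulting string goes on the kernel command line like the global form does. I have not tested the per-port form on the hardware discussed here.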

 

Another option, if your motherboard supports it, is to slightly increase the voltage on the motherboard. I have one system where the SATA ports had random hangs, and that was fixed by increasing the idle voltages on the motherboard. My guess is that the motherboard voltage controller wasn't up to the task: when the BIOS-controlled voltages dropped really low during idle (C1E...C6) and the kernel then issued multiple SATA commands in parallel, the signaling voltage dropped (voltage ripple?) and the Samsung SSDs started to produce errors. I don't know whether the voltage controller itself is bad or the BIOS voltage control simply has a bad firmware implementation, but increasing the voltage made the system totally stable and I haven't seen a single error since. That system uses Intel SATA chips, and I also tried the libata.force parameter above, but it didn't help there.
