ZFS plugin for unRAID


steini84


Hello,

 

I have a problem with my zpool.

My pool is a mirror of one Samsung 850 EVO 1TB and one 870 EVO 1TB. The two SSDs were connected to an LSI controller flashed to IT mode (firmware version 19).
A few days ago I saw errors for one SSD (the Samsung 870 EVO) in the syslog, along with some UDMA CRC errors and Hardware ECC recovered events.

I then disconnected both SSDs from the HBA and connected the Samsung 850 EVO 1TB and a brand-new 870 EVO 1TB with new SATA cables to the onboard SATA connectors of my HP ProLiant ML310e Gen8 v1 (SATA II, not III). I replaced the failing drive with the new 870 EVO 1TB, resilvered the zpool, and everything was fine.

 

Since yesterday, and again today after a reboot and starting the Docker service and some containers, I am seeing errors for the new 870 EVO (1TB) in the syslog and write errors in the zpool reported by "zpool status". I then started a zpool scrub, and during the scrub some cksum errors showed up in zpool status. After a zpool clear all errors were gone, and a follow-up scrub reported no new read, write or cksum errors. The SMART values for the new 870 EVO 1TB look fine: no CRC or other errors.
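
For reference, the rough sequence of commands was (pool name is zpool, as in the output below):

zpool status zpool    # showed write errors on the new 870 EVO
zpool scrub zpool     # cksum errors appeared in zpool status during the scrub
zpool clear zpool     # all error counters were reset
zpool scrub zpool     # second scrub completed with no new read/write/cksum errors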

The server has now been running for a few hours and no new errors from the SSD have appeared in the syslog. I don't dare restart the server, because new errors might come up again.

 

This is an excerpt of the syslog from a reboot this afternoon; /dev/sdg is the brand-new replacement Samsung 870 EVO 1TB:

Nov  7 16:39:45 Avalon kernel: ata5.00: exception Emask 0x0 SAct 0xfd901f SErr 0x0 action 0x6 frozen
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: SEND FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: SEND FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 64/01:08:00:00:00/00:00:00:00:00/a0 tag 1 ncq dma 512 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/08:10:4b:ab:0c/00:00:0e:00:00/40 tag 2 ncq dma 4096 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/44:18:5b:83:03/00:00:0e:00:00/40 tag 3 ncq dma 34816 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/44:20:73:bb:03/00:00:0e:00:00/40 tag 4 ncq dma 34816 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/23:60:06:40:00/00:00:1c:00:00/40 tag 12 ncq dma 17920 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/42:78:31:bb:03/00:00:0e:00:00/40 tag 15 ncq dma 33792 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/4d:80:b7:bb:03/00:00:0e:00:00/40 tag 16 ncq dma 39424 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/11:90:8f:b2:03/00:00:0e:00:00/40 tag 18 ncq dma 8704 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/08:98:8d:f5:02/00:00:1c:00:00/40 tag 19 ncq dma 4096 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/70:a0:9a:9f:03/00:00:1c:00:00/40 tag 20 ncq dma 57344 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/18:a8:0a:a0:03/00:00:1c:00:00/40 tag 21 ncq dma 12288 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/42:b0:de:90:03/00:00:0e:00:00/40 tag 22 ncq dma 33792 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  7 16:39:45 Avalon kernel: ata5.00: cmd 61/30:b8:ab:6a:03/00:00:0e:00:00/40 tag 23 ncq dma 24576 out
Nov  7 16:39:45 Avalon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  7 16:39:45 Avalon kernel: ata5.00: status: { DRDY }
Nov  7 16:39:45 Avalon kernel: ata5: hard resetting link
Nov  7 16:39:45 Avalon kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  7 16:39:45 Avalon kernel: ata5.00: supports DRM functions and may not be fully accessible
Nov  7 16:39:45 Avalon kernel: ata5.00: supports DRM functions and may not be fully accessible
Nov  7 16:39:45 Avalon kernel: ata5.00: configured for UDMA/133
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=42s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#2 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#2 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#2 CDB: opcode=0x2a 2a 00 0e 0c ab 4b 00 00 08 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 235711307 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120683140608 size=4096 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=30s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 CDB: opcode=0x2a 2a 00 0e 03 83 5b 00 00 44 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 235111259 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120375916032 size=34816 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=30s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 CDB: opcode=0x2a 2a 00 0e 03 bb 73 00 00 44 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 235125619 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120383268352 size=34816 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=42s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#12 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#12 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#12 CDB: opcode=0x2a 2a 00 1c 00 40 06 00 00 23 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 469778438 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=240525511680 size=17920 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#15 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=30s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#15 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#15 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#15 CDB: opcode=0x2a 2a 00 0e 03 bb 31 00 00 42 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 235125553 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120383234560 size=33792 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#16 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=30s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#16 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#16 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#16 CDB: opcode=0x2a 2a 00 0e 03 bb b7 00 00 4d 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 235125687 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120383303168 size=39424 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=42s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#18 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#18 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#18 CDB: opcode=0x2a 2a 00 0e 03 b2 8f 00 00 11 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 235123343 op 0x1:(WRITE) flags 0x700 phys_seg 2 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120382103040 size=8704 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#19 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=35s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#19 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#19 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#19 CDB: opcode=0x2a 2a 00 1c 02 f5 8d 00 00 08 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 469955981 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=240616413696 size=4096 flags=180880
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#20 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=35s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#20 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#20 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#20 CDB: opcode=0x2a 2a 00 1c 03 9f 9a 00 00 70 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 469999514 op 0x1:(WRITE) flags 0x700 phys_seg 14 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=240638702592 size=57344 flags=40080c80
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#21 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=35s
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#21 Sense Key : 0x5 [current] 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#21 ASC=0x21 ASCQ=0x4 
Nov  7 16:39:45 Avalon kernel: sd 6:0:0:0: [sdg] tag#21 CDB: opcode=0x2a 2a 00 1c 03 a0 0a 00 00 18 00
Nov  7 16:39:45 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 469999626 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=240638759936 size=12288 flags=180880
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120377687040 size=33792 flags=180880
Nov  7 16:39:45 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=120372680192 size=24576 flags=180880
Nov  7 16:39:45 Avalon kernel: ata5: EH complete
Nov  7 16:39:45 Avalon kernel: ata5.00: Enabling discard_zeroes_data

 

Current SMART values of the brand-new replacement Samsung 870 EVO 1TB after the reboot and the errors in the syslog:

#	Attribute Name	Flag	Value	Worst	Threshold	Type	Updated	Failed	Raw Value
5	Reallocated sector count	0x0033	100	100	010	Pre-fail	Always	Never	0
9	Power on hours	0x0032	099	099	000	Old age	Always	Never	32 (1d, 8h)
12	Power cycle count	0x0032	099	099	000	Old age	Always	Never	10
177	Wear leveling count	0x0013	099	099	000	Pre-fail	Always	Never	1
179	Used rsvd block count tot	0x0013	100	100	010	Pre-fail	Always	Never	0
181	Program fail count total	0x0032	100	100	010	Old age	Always	Never	0
182	Erase fail count total	0x0032	100	100	010	Old age	Always	Never	0
183	Runtime bad block	0x0013	100	100	010	Pre-fail	Always	Never	0
187	Reported uncorrect	0x0032	100	100	000	Old age	Always	Never	0
190	Airflow temperature cel	0x0032	075	062	000	Old age	Always	Never	25
195	Hardware ECC recovered	0x001a	200	200	000	Old age	Always	Never	0
199	UDMA CRC error count	0x003e	100	100	000	Old age	Always	Never	0
235	Unknown attribute	0x0012	099	099	000	Old age	Always	Never	5
241	Total lbas written	0x0032	099	099	000	Old age	Always	Never	291524814

 

Does anyone have an idea what the problem is here? Is the brand-new drive faulty, or is there a problem with the zpool?

 

root@Avalon:~# zpool status
  pool: zpool
 state: ONLINE
  scan: scrub repaired 0B in 00:06:59 with 0 errors on Sun Nov  7 17:05:41 2021
config:

        NAME                                             STATE     READ WRITE CKSUM
        zpool                                            ONLINE       0     0     0
          mirror-0                                       ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_1TB_S2RFNX0HA28280F  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M  ONLINE       0     0     0

errors: No known data errors

 

Thanks

Chris

 

Link to comment
6 hours ago, Shantarius said:

Does anyone have an idea what the problem is here? Is the brand-new drive faulty, or is there a problem with the zpool?

 

 

Could be a bad cable?

 

More likely, you might need to disable NCQ in Unraid Settings | Disk:

 

@MikkoRantalainen Apparently some users have the issue only with queued TRIM, some need to disable NCQ entirely. It depends if you have "SEND FPDMA QUEUED" (queued TRIM only) or "WRITE FPDMA QUEUED" (NCQ in general). bugzilla.kernel.org/show_bug.cgi?id=203475#c15 Me personally, I get the latter, and the only fix is to disable NCQ entirely via kernel command line. (Samsung 860 EVO + AMD FX 970 chipset) – 
NateDev
 Jun 12 at 3:10

https://unix.stackexchange.com/questions/623238/root-causes-for-failed-command-write-fpdma-queued
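
If it does turn out to be NCQ-related, two ways to test this (the device name /dev/sdg and port ata5 are taken from the syslog above - adjust for your system) are lowering the queue depth at runtime or disabling NCQ on the kernel command line:

# temporarily force queue depth 1 for the drive (effectively disables NCQ, reverts on reboot)
echo 1 > /sys/block/sdg/device/queue_depth

# or add one of these to the append line in syslinux.cfg on the Unraid flash drive:
#   libata.force=5.00:noncq    (disable NCQ only for ata5.00)
#   libata.force=noncq         (disable NCQ for all SATA devices)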

 

Keep in mind that LSI controllers don't support TRIM, so you're going to have performance issues with your SSDs as time passes.  Use onboard SATA ports if possible.  I have my LSI firmware on P16, as that's the latest version that works with TRIM.
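
If you want to check whether TRIM/discard is actually available through a given controller path, something like this should show it (assuming the SSD is /dev/sdg and hdparm is installed):

lsblk --discard /dev/sdg               # non-zero DISC-GRAN / DISC-MAX means discard works on this path
hdparm -I /dev/sdg | grep -i trim      # the drive itself should report "Data Set Management TRIM supported"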

 

edit:

 

UDMA CRC Errors

 

These are most commonly cable issues, I think.
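
One way to keep an eye on that is to watch the raw value of attribute 199 over time, e.g.:

smartctl -A /dev/sdg | grep -i crc     # a raw UDMA_CRC_Error_Count that keeps increasing usually points at the cable or backplane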

Edited by jortan
Link to comment

Hi,

thank you for your answer!

I have removed the LSI controller and am now using the SSDs with brand-new SATA cables on the mainboard's SATA II controller. After removing the LSI controller I also swapped an old SSD for a brand-new 870 EVO, and the errors now occur again with that drive. With the new SSD the errors come up after a reboot, but running overnight with Dockers and VMs there were no errors.

 

Today, after turning off NCQ, I ran an experiment and captured the status at each step:

 

1. Before reboot

root@Avalon:/mnt/zpool/Docker/Telegraf# zpool status
  pool: zpool
 state: ONLINE
  scan: scrub repaired 0B in 00:06:59 with 0 errors on Sun Nov  7 17:05:41 2021
config:

        NAME                                             STATE     READ WRITE CKSUM
        zpool                                            ONLINE       0     0     0
          mirror-0                                       ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_1TB_S2RFNX0HA28280F  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M  ONLINE       0     0     0

errors: No known data errors


/dev/sdg (EVO 870 new SSD)
#	Attribute Name	Flag	Value	Worst	Threshold	Type	Updated	Failed	Raw Value
5	Reallocated sector count	0x0033	100	100	010	Pre-fail	Always	Never	0
9	Power on hours	0x0032	099	099	000	Old age	Always	Never	52 (2d, 4h)
12	Power cycle count	0x0032	099	099	000	Old age	Always	Never	10
177	Wear leveling count	0x0013	099	099	000	Pre-fail	Always	Never	2
179	Used rsvd block count tot	0x0013	100	100	010	Pre-fail	Always	Never	0
181	Program fail count total	0x0032	100	100	010	Old age	Always	Never	0
182	Erase fail count total	0x0032	100	100	010	Old age	Always	Never	0
183	Runtime bad block	0x0013	100	100	010	Pre-fail	Always	Never	0
187	Reported uncorrect	0x0032	100	100	000	Old age	Always	Never	0
190	Airflow temperature cel	0x0032	074	062	000	Old age	Always	Never	26
195	Hardware ECC recovered	0x001a	200	200	000	Old age	Always	Never	0
199	UDMA CRC error count	0x003e	100	100	000	Old age	Always	Never	0
235	Unknown attribute	0x0012	099	099	000	Old age	Always	Never	5
241	Total lbas written	0x0032	099	099	000	Old age	Always	Never	347926153

 

2. After reboot, before starting the array

 


root@Avalon:~# zpool status
  pool: zpool
 state: ONLINE
  scan: scrub repaired 0B in 00:06:59 with 0 errors on Sun Nov  7 17:05:41 2021
config:

        NAME                                             STATE     READ WRITE CKSUM
        zpool                                            ONLINE       0     0     0
          mirror-0                                       ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_1TB_S2RFNX0HA28280F  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M  ONLINE       0     0     0

errors: No known data errors
root@Avalon:~#

/dev/sdg (EVO 870 new SSD)
#	Attribute Name	Flag	Value	Worst	Threshold	Type	Updated	Failed	Raw Value
5	Reallocated sector count	0x0033	100	100	010	Pre-fail	Always	Never	0
9	Power on hours	0x0032	099	099	000	Old age	Always	Never	52 (2d, 4h)
12	Power cycle count	0x0032	099	099	000	Old age	Always	Never	10
177	Wear leveling count	0x0013	099	099	000	Pre-fail	Always	Never	2
179	Used rsvd block count tot	0x0013	100	100	010	Pre-fail	Always	Never	0
181	Program fail count total	0x0032	100	100	010	Old age	Always	Never	0
182	Erase fail count total	0x0032	100	100	010	Old age	Always	Never	0
183	Runtime bad block	0x0013	100	100	010	Pre-fail	Always	Never	0
187	Reported uncorrect	0x0032	100	100	000	Old age	Always	Never	0
190	Airflow temperature cel	0x0032	078	062	000	Old age	Always	Never	22
195	Hardware ECC recovered	0x001a	200	200	000	Old age	Always	Never	0
199	UDMA CRC error count	0x003e	100	100	000	Old age	Always	Never	0
235	Unknown attribute	0x0012	099	099	000	Old age	Always	Never	5
241	Total lbas written	0x0032	099	099	000	Old age	Always	Never	347946967

 

3. After reboot, a few minutes after the array has started

 

root@Avalon:/mnt/zpool# zpool status
  pool: zpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:06:59 with 0 errors on Sun Nov  7 17:05:41 2021
config:

        NAME                                             STATE     READ WRITE CKSUM
        zpool                                            ONLINE       0     0     0
          mirror-0                                       ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_1TB_S2RFNX0HA28280F  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M  ONLINE       0    19     0

errors: No known data errors
root@Avalon:/mnt/zpool#

/dev/sdg (EVO 870 new SSD)
#	Attribute Name	Flag	Value	Worst	Threshold	Type	Updated	Failed	Raw Value
5	Reallocated sector count	0x0033	100	100	010	Pre-fail	Always	Never	0
9	Power on hours	0x0032	099	099	000	Old age	Always	Never	52 (2d, 4h)
12	Power cycle count	0x0032	099	099	000	Old age	Always	Never	10
177	Wear leveling count	0x0013	099	099	000	Pre-fail	Always	Never	2
179	Used rsvd block count tot	0x0013	100	100	010	Pre-fail	Always	Never	0
181	Program fail count total	0x0032	100	100	010	Old age	Always	Never	0
182	Erase fail count total	0x0032	100	100	010	Old age	Always	Never	0
183	Runtime bad block	0x0013	100	100	010	Pre-fail	Always	Never	0
187	Reported uncorrect	0x0032	100	100	000	Old age	Always	Never	0
190	Airflow temperature cel	0x0032	078	062	000	Old age	Always	Never	22
195	Hardware ECC recovered	0x001a	200	200	000	Old age	Always	Never	0
199	UDMA CRC error count	0x003e	100	100	000	Old age	Always	Never	0
235	Unknown attribute	0x0012	099	099	000	Old age	Always	Never	5
241	Total lbas written	0x0032	099	099	000	Old age	Always	Never	348067881

root@Avalon:/mnt/zpool# cat /var/log/syslog | grep 16:16:37
Nov  8 16:16:37 Avalon kernel: ata5.00: exception Emask 0x0 SAct 0x1e018 SErr 0x0 action 0x6 frozen
Nov  8 16:16:37 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  8 16:16:37 Avalon kernel: ata5.00: cmd 61/04:18:06:6a:00/00:00:0c:00:00/40 tag 3 ncq dma 2048 out
Nov  8 16:16:37 Avalon kernel:         res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  8 16:16:37 Avalon kernel: ata5.00: status: { DRDY }
Nov  8 16:16:37 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  8 16:16:37 Avalon kernel: ata5.00: cmd 61/36:20:57:6a:00/00:00:0c:00:00/40 tag 4 ncq dma 27648 out
Nov  8 16:16:37 Avalon kernel:         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  8 16:16:37 Avalon kernel: ata5.00: status: { DRDY }
Nov  8 16:16:37 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  8 16:16:37 Avalon kernel: ata5.00: cmd 61/10:68:10:26:70/00:00:74:00:00/40 tag 13 ncq dma 8192 out
Nov  8 16:16:37 Avalon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  8 16:16:37 Avalon kernel: ata5.00: status: { DRDY }
Nov  8 16:16:37 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  8 16:16:37 Avalon kernel: ata5.00: cmd 61/10:70:10:24:70/00:00:74:00:00/40 tag 14 ncq dma 8192 out
Nov  8 16:16:37 Avalon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  8 16:16:37 Avalon kernel: ata5.00: status: { DRDY }
Nov  8 16:16:37 Avalon kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Nov  8 16:16:37 Avalon kernel: ata5.00: cmd 61/10:78:10:0a:00/00:00:00:00:00/40 tag 15 ncq dma 8192 out
Nov  8 16:16:37 Avalon kernel:         res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  8 16:16:37 Avalon kernel: ata5.00: status: { DRDY }
Nov  8 16:16:37 Avalon kernel: ata5.00: failed command: SEND FPDMA QUEUED
Nov  8 16:16:37 Avalon kernel: ata5.00: cmd 64/01:80:00:00:00/00:00:00:00:00/a0 tag 16 ncq dma 512 out
Nov  8 16:16:37 Avalon kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov  8 16:16:37 Avalon kernel: ata5.00: status: { DRDY }
Nov  8 16:16:37 Avalon kernel: ata5: hard resetting link
Nov  8 16:16:37 Avalon kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  8 16:16:37 Avalon kernel: ata5.00: supports DRM functions and may not be fully accessible
Nov  8 16:16:37 Avalon kernel: ata5.00: supports DRM functions and may not be fully accessible
Nov  8 16:16:37 Avalon kernel: ata5.00: configured for UDMA/133
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=30s
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 Sense Key : 0x5 [current]
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 ASC=0x21 ASCQ=0x4
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#3 CDB: opcode=0x2a 2a 00 0c 00 6a 06 00 00 04 00
Nov  8 16:16:37 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 201353734 op 0x1:(WRITE) flags 0x700 phys_seg 2 prio class 0
Nov  8 16:16:37 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=103092063232 size=2048 flags=40080c80
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=30s
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 Sense Key : 0x5 [current]
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 ASC=0x21 ASCQ=0x4
Nov  8 16:16:37 Avalon kernel: sd 6:0:0:0: [sdg] tag#4 CDB: opcode=0x2a 2a 00 0c 00 6a 57 00 00 36 00
Nov  8 16:16:37 Avalon kernel: blk_update_request: I/O error, dev sdg, sector 201353815 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Nov  8 16:16:37 Avalon kernel: zio pool=zpool vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5 type=2 offset=103092104704 size=27648 flags=180880
Nov  8 16:16:37 Avalon kernel: ata5: EH complete
Nov  8 16:16:37 Avalon kernel: ata5.00: Enabling discard_zeroes_data
root@Avalon:/mnt/zpool#

 

4. After starting the Docker service and the VM service (Dockers and VMs are on the zpool): no new errors.

 

5. After a zpool scrub: no new errors in the syslog, and no new errors in zpool status

 

root@Avalon:/mnt/zpool# zpool status -v zpool
  pool: zpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:05:29 with 0 errors on Mon Nov  8 16:34:53 2021
config:

        NAME                                             STATE     READ WRITE CKSUM
        zpool                                            ONLINE       0     0     0
          mirror-0                                       ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_1TB_S2RFNX0HA28280F  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M  ONLINE       0    19     0

errors: No known data errors
root@Avalon:/mnt/zpool#

/dev/sdg (EVO 870 new SSD)
#	Attribute Name	Flag	Value	Worst	Threshold	Type	Updated	Failed	Raw Value
5	Reallocated sector count	0x0033	100	100	010	Pre-fail	Always	Never	0
9	Power on hours	0x0032	099	099	000	Old age	Always	Never	52 (2d, 4h)
12	Power cycle count	0x0032	099	099	000	Old age	Always	Never	10
177	Wear leveling count	0x0013	099	099	000	Pre-fail	Always	Never	2
179	Used rsvd block count tot	0x0013	100	100	010	Pre-fail	Always	Never	0
181	Program fail count total	0x0032	100	100	010	Old age	Always	Never	0
182	Erase fail count total	0x0032	100	100	010	Old age	Always	Never	0
183	Runtime bad block	0x0013	100	100	010	Pre-fail	Always	Never	0
187	Reported uncorrect	0x0032	100	100	000	Old age	Always	Never	0
190	Airflow temperature cel	0x0032	073	062	000	Old age	Always	Never	27
195	Hardware ECC recovered	0x001a	200	200	000	Old age	Always	Never	0
199	UDMA CRC error count	0x003e	100	100	000	Old age	Always	Never	0
235	Unknown attribute	0x0012	099	099	000	Old age	Always	Never	5
241	Total lbas written	0x0032	099	099	000	Old age	Always	Never	349169583

 

6. After zpool clear
 

root@Avalon:/mnt/zpool# zpool status -v zpool
  pool: zpool
 state: ONLINE
  scan: scrub repaired 0B in 00:05:14 with 0 errors on Mon Nov  8 16:42:34 2021
config:

        NAME                                             STATE     READ WRITE CKSUM
        zpool                                            ONLINE       0     0     0
          mirror-0                                       ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_1TB_S2RFNX0HA28280F  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M  ONLINE       0     0     0

errors: No known data errors
root@Avalon:/mnt/zpool#

 

What is meant with "ata5: hard resetting link" in the syslog?

 

Is it possible that the "vdev=/dev/disk/by-id/ata-Samsung_SSD_870_EVO_1TB_S626NF0R226283M-part1 error=5" Partition is damaged? Can i set the 870EVO offline, delete the Partitions at this SSD and replace/add it back to the mirror?

 

 

Link to comment
15 hours ago, asopala said:

Hey all,

 

Anybody know how to use the Recycle Bin plugin with ZFS?  I have the datasets shared via SMB in the smb.extra config, and I was wondering if it is possible to have it work with the ZFS array for the sake of delete protection.

 

Not aware of Recycle Bin functionality, but you can have ZFS publish snapshots to the Shadowcopy / Previous Versions tab in Windows, so you can view older versions of your shares from Windows:

 

Scroll to "ZFS Snapshots – Shadow copies!!! Part 1, the underlying config" here:

https://forum.level1techs.com/t/zfs-on-unraid-lets-do-it-bonus-shadowcopy-setup-guide-project/148764
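
Roughly, the idea is to take regularly named ZFS snapshots and point Samba's shadow_copy2 VFS module at the dataset's hidden .zfs/snapshot directory. A minimal sketch (share name, dataset and snapshot naming below are just examples, not taken from the guide):

# e.g. snapshots created as: zfs snapshot tank/share@autosnap-2021-11-09-120000
[share]
  path = /mnt/tank/share
  vfs objects = shadow_copy2
  shadow:snapdir = .zfs/snapshot
  shadow:sort = desc
  shadow:format = autosnap-%Y-%m-%d-%H%M%S
  shadow:localtime = yes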

 

edit: You can enable Recycle Bin as well:

https://forum.level1techs.com/t/zfs-on-unraid-lets-do-it-bonus-shadowcopy-setup-guide-project/148764/238

Edited by jortan
Link to comment
4 hours ago, Iker said:

ZFS Master 2021.11.09a is live with a few changes, check it out:

 

2021.11.09a

- Add - List of current Datasets at Dataset Creation

- Add - Option for export a Pool

- Fix - Compatibility with RC version of unRAID

Where do we find this? I cannot find this plugin.

Link to comment
6 minutes ago, muddro said:

Where do we find this. Cannot find this plugin

 

Do you have the "Community Applications" plugin installed?  This adds the "Apps" tab to unRAID.

 

If you're already searching in "Apps" and still can't find it, what version of unRAID are you running?  I'm still on 6.10.0-rc1, not sure if it's published for rc2 yet?

 

Edited by jortan
Link to comment
5 minutes ago, jortan said:

 

Do you have the "Community Applications" plugin installed?  This adds the "Apps" tab to unRAID.

 

If you're already searching in "Apps" and still can't find it, what version of unRAID are you running?  I'm still on 6.10.0-rc1, not sure if it's published for rc2 yet?

 

Rc2

 

edit: nm, it's there. I guess the previous version wasn't, because I went looking for this yesterday.

 

 

Edited by muddro
Link to comment
7 hours ago, jortan said:

 

Not aware of Recycle Bin functionality, but you can have ZFS publish snapshots to the Shadowcopy / Previous Versions tab in Windows, so you can view older versions of your shares from Windows:

 

Scroll to "ZFS Snapshots – Shadow copies!!! Part 1, the underlying config" here:

https://forum.level1techs.com/t/zfs-on-unraid-lets-do-it-bonus-shadowcopy-setup-guide-project/148764

 

edit: You can enable Recycle Bin as well:

https://forum.level1techs.com/t/zfs-on-unraid-lets-do-it-bonus-shadowcopy-setup-guide-project/148764/238

 

Would the recycle bin vfs object setting need to be applied to every SMB share in smbextra.config?  I'm running into a 2048 character limit and have been dealing with that.

Link to comment
51 minutes ago, asopala said:

 

Would the recycle bin vfs object setting need to be applied to every SMB share in smbextra.config?  I'm running into a 2048 character limit and have been dealing with that.

 

You could use the Global Section, but  I'm not really an expert in SMB, so YMMV; however, another option is editing the smb-extra.conf file directly from the flash/flash-share so you don't have the 2048 characters limit from the GUI.
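
If you try the [global] route, an untested minimal sketch of the recycle settings (repository path and excludes are just examples) would be:

[global]
  vfs objects = recycle
  recycle:repository = .recycle/%U
  recycle:keeptree = yes
  recycle:versions = yes
  recycle:exclude = *.tmp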

Link to comment
1 minute ago, Iker said:

 

You could use the Global Section, but  I'm not really an expert in SMB, so YMMV; however, another option is editing the smb-extra.conf file directly from the flash/flash-share so you don't have the 2048 characters limit from the GUI.

 

I think that's the way to go.  I also didn't realize it's a GUI issue with the smb-extra.conf file being limited to 2kb.  That makes everything easier.

Link to comment

By the way, has anybody figured out how to make a successful Time Machine dataset in ZFS?  I did the usual protocol of making a new dataset in the terminal and using SpaceInvaderOne's code for smb-extra.conf to put Time Machine on an unassigned device (as linked a while back by @etsjessey), but I can't for the life of me get it to show up as a Time Machine destination on my Mac.  Everything else shows up normally, but no dice.  Here's the code I'm using, placed right after the rootshare configuration.

 

[Alex Time Machine]
  comment =
  ea support = Yes
  path = /mnt/tank/AlexTimeMachine
  browseable = yes
  guest ok = no
  valid users = asopala
  write list = asopala
  writeable = yes
  vfs objects = catia fruit streams_xattr
  fruit:time machine max size = 1000 G
  fruit:encoding = native
  fruit:locking = netatalk
  fruit:metadata = netatalk
  fruit:resource = file
  fruit:time machine = yes
  fruit:advertise_fullsync = true
  fruit:model = MacSamba
  fruit:posix_rename = yes
  fruit:zero_file_id = yes
  fruit:veto_appledouble = no
  fruit:wipe_intentionally_left_blank_rfork = yes 
  fruit:delete_empty_adfiles = yes
  durable handles = yes
  kernel oplocks = no
  kernel share modes = no
  posix locking = no
  inherit acls = yes

#unassigned_devices_start
#Unassigned devices share includes
   include = /tmp/unassigned.devices/smb-settings.conf
#unassigned_devices_end

 

Link to comment
16 minutes ago, asopala said:

By the way, has anybody figured out how to make a successful Time Machine dataset in ZFS?  I did the usual protocol of making a new dataset in the terminal and using SpaceInvaderOne's code for smb-extra.conf to put Time Machine on an unassigned device (as linked a while back by @etsjessey), but I can't for the life of me get it to show up as a Time Machine destination on my Mac.  Everything else shows up normally, but no dice.

 


 

I just set one up yesterday, but I use the symlink approach to create the share: I did it through Unraid's GUI after linking a dataset to a folder under /mnt/user/.

Link to comment
Just now, asopala said:

 

Is there a guide on how to do that?  I'm not sure where to start.

ln -s source /mnt/user/timemachine

Basically, replace source with the path to your Time Machine dataset. A share called timemachine will then appear in the unRAID GUI, and you just export the share using the Time Machine option and attach a user.
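
With the dataset path from your smb config above, that would be something like:

ln -s /mnt/tank/AlexTimeMachine /mnt/user/timemachine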

Link to comment
On 11/11/2021 at 11:34 AM, muddro said:
ln -s source /mnt/user/timemachine

Basically, replace source with the path to your Time Machine dataset. A share called timemachine will then appear in the unRAID GUI, and you just export the share using the Time Machine option and attach a user.

That did it exactly, thanks!

 

Looks like the last thing I need to do is to set up rclone to my google drive (while I can still take advantage of unlimited storage).  Anybody know where's the best place to mount the remote shares so that dockers have access to them, and it syncs the contents of the entire pool?  I didn't see anything along those lines in the thread, and running SpaceInvaderOne's tutorial, I'm not sure where to set the mount point for the remote shares to be mounted and unmounted on startup and shutdown.  Can't use /mnt/disks/subdirectory when I'm not using the array for anything (currently just a dummy drive).

Link to comment

For my main unRAID cache I'm about to replace my single 2.5" 2TB SSD with 2 x 2TB NVME. If you create a cache pool using ZFS, specifically a mirrored ZFS pool of the 2 x 2TB NVME, does unRAID detect this so that you no longer get the 'Some or all files are unprotected' warning for shares that use the pool?

 

I used ZFS in my FreeNAS days, and intend to build a high-speed ZFS scrub pool for video editing eventually. For now, I'm just looking at replacing my existing single cache drive with a mirrored two-drive pool. I'm planning to stick with BTRFS, but the ZFS snapshot functionality is something I'm interested in. To my knowledge, BTRFS also supports snapshots, but it's all command-line and not officially supported by unRAID. The same is true for ZFS - using snapshots is also command-line only.
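
For anyone curious what that command-line use looks like, it's fairly simple (dataset name here is just an example):

zfs snapshot cache/appdata@before-upgrade   # take a snapshot
zfs list -t snapshot cache/appdata          # list snapshots for the dataset
zfs rollback cache/appdata@before-upgrade   # roll the dataset back to it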

 

Are there any good reasons to choose ZFS over BTRFS? Neither has native unRAID snapshot support, and I've been OK with my scrubs of the single-drive BTRFS cache. I know others have reported issues with BTRFS, and some of those users have even gone with XFS for the main cache drive/pool.

 

Link to comment

My experience was to avoid BTRFS for your cache if you value that data, so I used XFS.  At the time ZFS wasn't an option for the cache; I heard a rumour that it might be now, but I still have my doubts, as Unraid still doesn't officially support it.  The thing was, I never had a failed cache drive, but I did have multiple BTRFS filesystem failures once I moved to a BTRFS mirror.  That defeats the whole point of a mirror, which was meant to keep the data safe.  So it ended up being safer to have a single XFS cache.

 

There are lots of other people who had a similar experience, and lots who say they haven't, or more likely haven't had the issue of repairing the BTRFS filesystem afterwards.  That was the thing that got me the most: the filesystem was unrepairable.  That basically doesn't happen on ZFS.

 

For the video scrub pool, using a ZFS special metadata vdev is a good idea - you can then run the scrub pool on standard HDDs, because all the filesystem metadata used for searching and so on lives on the SSD - something to consider anyway.  My system is 100% ZFS now, and because of that it doesn't really even need a cache of the style Unraid has.
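
For reference, a special vdev can be added like this (device names are placeholders; it should be mirrored, since losing the special vdev loses the pool):

# add a mirrored special (metadata) vdev to an existing pool called "scratch"
zpool add scratch special mirror /dev/nvme0n1 /dev/nvme1n1

# or include it when creating the pool
zpool create scratch raidz1 /dev/sda /dev/sdb /dev/sdc special mirror /dev/nvme0n1 /dev/nvme1n1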

Link to comment
4 minutes ago, JorgeB said:

Unraid won't detect that share (or shares), so there's also no warning, independent of the pool being redundant or not.

 

So to clarify, shares that use the single drive cache pool all have that warning currently. My understanding is that unRAID detects the mirror set if using BTRFS so you no longer get the warning. But it won't do this if I decided to go with ZFS?

 

14 minutes ago, Marshalleq said:

My experience was to avoid BTRFS for your cache if you value that data.  I therefore used XFS.  At the time ZFS wasn't an option in the cache, I heard a rumour that it might be now, but I still have my doubts as unraid still don't officially support it.  The thing was I never had a failed cache drive, but I did have multiple BTRFS file system failures once I moved to a BTRFS mirror.  So that defeats the whole point of a mirror which was meant to keep the data safe.  So it ended up being safer to have a single XFS cache.

 

There are lots of other people whom had a similar experience and lots that say that they haven't OR more likely haven't had the issue fixing the BTRFS file system afterward.  That was the thing that got me the most, the filesystem was unrepairable.  That basically doesn't happen on ZFS.

 

I've been using BTRFS for my single drive cache pool since I started using unRAID about 2.5 years ago. I guess I'm one of the lucky ones as other than actual drive failure, I've never experienced any BTRFS errors that couldn't be corrected by a scrub. Are the folks that report issues with BTRFS using single or multi-drive pools? I do remember the stability of ZFS from my FreeNAS days, but I'm tempted to hold off until unRAID adds native support for it.

 

 

Link to comment
6 minutes ago, AgentXXL said:

 

So to clarify, shares that use the single drive cache pool all have that warning currently. My understanding is that unRAID detects the mirror set if using BTRFS so you no longer get the warning. But it won't do this if I decided to go with ZFS?

 

 

To clarify, if you do this you'll have a ZFS pool, but it will be made up of unassigned devices. I am not aware of Unraid having the capability to create a ZFS pool as a cache pool.

 

The ZFS pool you make will likely need to be linked to a folder under /mnt/user/ to expose it to shares, and then you set those shares up to not use the mover.

Link to comment

I don't think unraid supports zfs in a cache pool.  If it does, I suspect the warning will persist.  However the warning does go away if you use an official Unraid mirror for the cache pool.  I used to run that for the same reason, and that's when I started getting BTRFS issues.

 

Regarding the other questions, I believe I had issues with both mirrors and non-mirrors with BTRFS.  I ended up running XFS, but ultimately got annoyed and had another attempt at BTRFS, which also failed (I probably did three different spurts over 12 months, each time with issues after a month or so of use).  There's probably something specific to the way I was using it - perhaps I overfilled it a few times or something - but no filesystem should crap itself just because of how I was using it.

 

There were also problems with Unraid's implementation of BTRFS in the GUI, which didn't help - initially denied, then later corrected - to do with how it created the mirror, metadata and balancing, if I recall correctly.  All in all, not a good experience.

 

I will never, ever use BTRFS again; there is very clearly something wrong with it, and you can see those reports if you look around.

 

I often wondered why Unraid didn't just offer a standard mirror array (i.e. with mdadm); that would have been better than BTRFS.

Link to comment
