FLK Posted May 22, 2015

Hi,

During my last parity check I was notified about "READ ERRORS" on one of my drives, but I have absolutely no idea what I must do now. Is there any way to know which files are involved? And what exactly does the read error mean? Looking at the SMART values, there are no reallocated sectors, only "Raw Read Error Rate" with a raw value of "25". How should I take care of those errors? Replace the disk? I'm running unRAID 6 RC3 using btrfs on all disks.

Here is the syslog with two parity checks:

May 22 09:15:29 unFLK kernel: ata3.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0 May 22 09:15:29 unFLK kernel: ata3.00: irq_stat 0x40000008 May 22 09:15:29 unFLK kernel: ata3.00: failed command: READ FPDMA QUEUED May 22 09:15:29 unFLK kernel: ata3.00: cmd 60/20:38:c8:19:03/03:00:38:01:00/40 tag 7 ncq 409600 in May 22 09:15:29 unFLK kernel: res 41/40:00:d8:1b:03/00:00:38:01:00/40 Emask 0x409 (media error) May 22 09:15:29 unFLK kernel: ata3.00: status: { DRDY ERR } May 22 09:15:29 unFLK kernel: ata3.00: error: { UNC } May 22 09:15:29 unFLK kernel: ata3.00: configured for UDMA/133 May 22 09:15:29 unFLK kernel: sd 3:0:0:0: [sdd] tag#7 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 May 22 09:15:29 unFLK kernel: sd 3:0:0:0: [sdd] tag#7 Sense Key : 0x3 [current] [descriptor] May 22 09:15:29 unFLK kernel: sd 3:0:0:0: [sdd] tag#7 ASC=0x11 ASCQ=0x4 May 22 09:15:29 unFLK kernel: sd 3:0:0:0: [sdd] tag#7 CDB: opcode=0x88 88 00 00 00 00 01 38 03 19 c8 00 00 03 20 00 00 May 22 09:15:29 unFLK kernel: blk_update_request: I/O error, dev sdd, sector 5234695128 May 22 09:15:29 unFLK kernel: ata3: EH complete May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695064 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695072 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695080 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695088 May 22 09:15:29 unFLK kernel: md: disk4 read
error, sector=5234695096 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695104 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695112 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695120 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695128 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695136 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695144 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695152 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695160 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695168 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695176 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695184 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695192 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695200 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695208 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695216 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695224 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695232 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695240 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695248 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695256 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695264 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695272 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695280 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695288 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695296 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695304 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695312 May 22 09:15:29 unFLK kernel: md: disk4 read error, sector=5234695320 May 22 
09:15:29 unFLK kernel: md: disk4 read error, sector=5234695328 May 22 09:15:42 unFLK kernel: ata3.00: exception Emask 0x0 SAct 0x1c000000 SErr 0x0 action 0x0 May 22 09:15:42 unFLK kernel: ata3.00: irq_stat 0x40000008 May 22 09:15:42 unFLK kernel: ata3.00: failed command: READ FPDMA QUEUED May 22 09:15:42 unFLK kernel: ata3.00: cmd 60/40:e0:00:21:03/05:00:38:01:00/40 tag 28 ncq 688128 in May 22 09:15:42 unFLK kernel: res 41/40:00:a0:23:03/00:00:38:01:00/40 Emask 0x409 (media error) May 22 09:15:42 unFLK kernel: ata3.00: status: { DRDY ERR } May 22 09:15:42 unFLK kernel: ata3.00: error: { UNC } May 22 09:15:42 unFLK kernel: ata3.00: configured for UDMA/133 May 22 09:15:42 unFLK kernel: sd 3:0:0:0: [sdd] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 May 22 09:15:42 unFLK kernel: sd 3:0:0:0: [sdd] tag#28 Sense Key : 0x3 [current] [descriptor] May 22 09:15:42 unFLK kernel: sd 3:0:0:0: [sdd] tag#28 ASC=0x11 ASCQ=0x4 May 22 09:15:42 unFLK kernel: sd 3:0:0:0: [sdd] tag#28 CDB: opcode=0x88 88 00 00 00 00 01 38 03 21 00 00 00 05 40 00 00 May 22 09:15:42 unFLK kernel: blk_update_request: I/O error, dev sdd, sector 5234697120 May 22 09:15:42 unFLK kernel: ata3: EH complete May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697056 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697064 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697072 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697080 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697088 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697096 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697104 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697112 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697120 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697128 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697136 May 22 09:15:42 unFLK kernel: md: 
disk4 read error, sector=5234697144 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697152 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697160 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697168 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697176 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697184 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697192 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697200 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697208 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697216 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697224 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697232 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697240 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697248 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697256 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697264 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697272 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697280 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697288 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697296 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697304 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697312 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697320 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697328 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697336 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697344 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697352 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697360 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697368 May 
22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697376 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697384 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697392 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697400 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697408 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697416 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697424 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697432 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697440 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697448 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697456 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697464 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697472 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697480 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697488 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697496 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697504 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697512 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697520 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697528 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697536 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697544 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697552 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697560 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697568 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697576 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697584 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697592 May 22 09:15:42 unFLK kernel: md: disk4 
read error, sector=5234697600 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697608 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697616 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697624 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697632 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697640 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697648 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697656 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697664 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697672 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697680 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697688 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697696 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697704 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697712 May 22 09:15:42 unFLK kernel: md: disk4 read error, sector=5234697720 May 22 09:15:49 unFLK kernel: ata3.00: exception Emask 0x0 SAct 0x6 SErr 0x0 action 0x0 May 22 09:15:49 unFLK kernel: ata3.00: irq_stat 0x40000008 May 22 09:15:49 unFLK kernel: ata3.00: failed command: READ FPDMA QUEUED May 22 09:15:49 unFLK kernel: ata3.00: cmd 60/40:08:40:26:03/05:00:38:01:00/40 tag 1 ncq 688128 in May 22 09:15:49 unFLK kernel: res 41/40:00:57:2a:03/00:00:38:01:00/40 Emask 0x409 (media error) May 22 09:15:49 unFLK kernel: ata3.00: status: { DRDY ERR } May 22 09:15:49 unFLK kernel: ata3.00: error: { UNC } May 22 09:15:49 unFLK kernel: ata3.00: configured for UDMA/133 May 22 09:15:49 unFLK kernel: sd 3:0:0:0: [sdd] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 May 22 09:15:49 unFLK kernel: sd 3:0:0:0: [sdd] tag#1 Sense Key : 0x3 [current] [descriptor] May 22 09:15:49 unFLK kernel: sd 3:0:0:0: [sdd] tag#1 ASC=0x11 ASCQ=0x4 May 22 09:15:49 unFLK kernel: sd 3:0:0:0: [sdd] tag#1 
CDB: opcode=0x88 88 00 00 00 00 01 38 03 26 40 00 00 05 40 00 00 May 22 09:15:49 unFLK kernel: blk_update_request: I/O error, dev sdd, sector 5234698839 May 22 09:15:49 unFLK kernel: ata3: EH complete May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698768 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698776 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698784 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698792 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698800 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698808 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698816 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698824 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698832 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698840 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698848 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698856 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698864 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698872 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698880 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698888 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698896 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698904 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698912 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698920 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698928 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698936 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698944 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698952 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698960 May 22 09:15:49 unFLK kernel: md: disk4 read 
error, sector=5234698968 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698976 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698984 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234698992 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699000 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699008 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699016 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699024 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699032 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699040 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699048 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699056 May 22 09:15:49 unFLK kernel: md: disk4 read error, sector=5234699064 May 22 09:16:01 unFLK sSMTP[12529]: Creating SSL connection to host May 22 09:16:01 unFLK sSMTP[12529]: SSL connection using DHE-RSA-AES256-GCM-SHA384 May 22 09:16:03 unFLK sSMTP[12529]: Sent mail for xxxxxxxxxxxxxxxx (221 2.0.0 Bye) uid=0 username=root outbytes=766 May 22 09:25:53 unFLK kernel: ata3.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0 May 22 09:25:53 unFLK kernel: ata3.00: irq_stat 0x40000008 May 22 09:25:53 unFLK kernel: ata3.00: failed command: READ FPDMA QUEUED May 22 09:25:53 unFLK kernel: ata3.00: cmd 60/40:00:60:fe:1e/05:00:39:01:00/40 tag 0 ncq 688128 in May 22 09:25:53 unFLK kernel: res 41/40:00:77:03:1f/00:00:39:01:00/40 Emask 0x409 (media error) May 22 09:25:53 unFLK kernel: ata3.00: status: { DRDY ERR } May 22 09:25:53 unFLK kernel: ata3.00: error: { UNC } May 22 09:25:53 unFLK kernel: ata3.00: configured for UDMA/133 May 22 09:25:53 unFLK kernel: sd 3:0:0:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 May 22 09:25:53 unFLK kernel: sd 3:0:0:0: [sdd] tag#0 Sense Key : 0x3 [current] [descriptor] May 22 09:25:53 unFLK kernel: sd 3:0:0:0: [sdd] tag#0 ASC=0x11 
ASCQ=0x4 May 22 09:25:53 unFLK kernel: sd 3:0:0:0: [sdd] tag#0 CDB: opcode=0x88 88 00 00 00 00 01 39 1e fe 60 00 00 05 40 00 00 May 22 09:25:53 unFLK kernel: blk_update_request: I/O error, dev sdd, sector 5253301111 May 22 09:25:53 unFLK kernel: ata3: EH complete May 22 09:25:53 unFLK kernel: md: disk4 read error, sector=5253301040 May 22 09:25:53 unFLK kernel: md: disk4 read error, sector=5253301048 May 22 09:25:53 unFLK kernel: md: disk4 read error, sector=5253301056 May 22 09:25:53 unFLK kernel: md: disk4 read error, sector=5253301064 May 22 09:25:53 unFLK kernel: md: disk4 read error, sector=5253301072 May 22 09:25:53 unFLK kernel: md: disk4 read error, sector=5253301080 May 22 10:42:01 unFLK kernel: md: sync done. time=36719sec May 22 10:42:01 unFLK kernel: md: recovery thread sync completion status: 0 May 22 10:47:13 unFLK emhttp: read_line: client closed the connection May 22 10:47:13 unFLK emhttp: read_line: client closed the connection May 22 10:47:13 unFLK emhttp: read_line: client closed the connection May 22 10:47:13 unFLK emhttp: read_line: client closed the connection May 22 10:47:13 unFLK emhttp: read_line: client closed the connection May 22 11:04:56 unFLK php: /sbin/btrfs scrub start -B -r /mnt/disk4 &>/dev/null & May 22 11:30:29 unFLK emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1 May 22 11:42:02 unFLK kernel: mdcmd (46): spindown 0 May 22 11:42:02 unFLK kernel: mdcmd (47): spindown 1 May 22 12:55:09 unFLK kernel: mdcmd (48): spindown 0 May 22 12:55:29 unFLK kernel: mdcmd (49): spindown 2 May 22 13:00:02 unFLK kernel: mdcmd (50): spindown 1 May 22 14:10:45 unFLK kernel: mdcmd (51): check CORRECT May 22 14:10:45 unFLK kernel: md: recovery thread woken up ... May 22 14:10:45 unFLK kernel: md: recovery thread checking parity... May 22 14:10:45 unFLK kernel: md: using 1536k window, over a total of 2930266532 blocks. 
May 22 14:11:01 unFLK sSMTP[18014]: Creating SSL connection to host May 22 14:11:01 unFLK sSMTP[18014]: SSL connection using DHE-RSA-AES256-GCM-SHA384 May 22 14:11:03 unFLK sSMTP[18014]: Sent mail for xxxxxxxxxxxxxxxx (221 2.0.0 Bye) uid=0 username=root outbytes=688 May 22 20:46:12 unFLK kernel: mdcmd (52): spindown 2 May 23 00:14:50 unFLK emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1 May 23 00:16:07 unFLK kernel: md: sync done. time=36321sec May 23 00:16:07 unFLK kernel: md: recovery thread sync completion status: 0 May 23 00:20:01 unFLK sSMTP[25703]: Creating SSL connection to host May 23 00:20:01 unFLK sSMTP[25703]: SSL connection using DHE-RSA-AES256-GCM-SHA384 May 23 00:20:02 unFLK sSMTP[25703]: Sent mail for xxxxxxxxxxxxxxxx (221 2.0.0 Bye) uid=0 username=root outbytes=1342 May 23 00:40:09 unFLK kernel: mdcmd (53): spindown 2 May 23 00:41:55 unFLK emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1

Thanks!
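The numbers in the syslog above can be cross-checked against each other. The sketch below decodes the failing LBA from an ATA "res" line and relates it to the first md sector reported for the same error. It rests on two assumptions: the field order of the "res" taskfile (status/error:nsect:lbal:lbam:lbah/hob fields/device) is my reading of the libata message layout, and the 64-sector partition offset with 8-sector (4 KiB) rounding is inferred only from the numbers in this log, where it holds for all four errors. The function names are hypothetical.

```python
import re

# Twelve two-digit hex fields separated by '/' or ':' after the word "res".
# Assumed order: status/error:nsect:lbal:lbam:lbah/
#                hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
RES_RE = re.compile(r"res ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")

# Assumption inferred from this log: the data partition starts 64 sectors
# into the disk, and md reports errors in 8-sector (4 KiB) units.
PARTITION_START = 64
MD_BLOCK_SECTORS = 8


def res_lba(line):
    """Return the 48-bit LBA encoded in a libata 'res' line."""
    match = RES_RE.search(line)
    if match is None:
        raise ValueError("no 'res' taskfile found in line")
    f = [int(x, 16) for x in re.split(r"[/:]", match.group(1))]
    lbal, lbam, lbah = f[3], f[4], f[5]
    hob_lbal, hob_lbam, hob_lbah = f[8], f[9], f[10]
    return (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
            | lbah << 16 | lbam << 8 | lbal)


def md_first_sector(disk_lba):
    """Map a whole-disk LBA to the first md sector logged for that error."""
    relative = disk_lba - PARTITION_START
    return relative // MD_BLOCK_SECTORS * MD_BLOCK_SECTORS


line = "res 41/40:00:d8:1b:03/00:00:38:01:00/40 Emask 0x409 (media error)"
lba = res_lba(line)
print(lba, md_first_sector(lba))
```

For the first error this gives the same sector that blk_update_request reports for sdd, and the same starting sector as the first "md: disk4 read error" line. The long run of md lines after each error covers the rest of the failed request in 4 KiB steps, so the number of md lines is not the number of bad sectors.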
RobJ Posted May 23, 2015

Please see "Need help? Read me first!". We need a full syslog and a SMART report for Disk 4.
FLK Posted May 23, 2015

Sorry about that! Attached both files: syslog.txt smart.txt
RobJ Posted May 23, 2015

Well, I'm a bit stumped. Disk 4 is clearly reporting bad sectors, physical problems with the sector media, uncorrectable. But when we ask the SMART portion of the drive, it says "nope, no problems here, everything's fine"! So who's right? Have we got an internal mutiny?

You recently did a SMART short test. Try the long test, which will force SMART to examine every single sector. Polling time for the long test says 393 minutes, a little over 6.5 hours! Seems a little long, but maybe not.

Some minor oddities -

* The motherboard is claiming 6.0 Gbps speeds for the 6 onboard SATA ports, but in this syslog, the last 4 ports only linked up at 3.0 Gbps. The first 2 did link at 6.0 Gbps. That means the SSD and your parity drive are on the fastest ports, which is desirable. But there is no explanation why the other 4 ports were not faster. I noticed this on Disk 4, which is a Red like the parity drive, and connected right next to it. Have to wonder if your motherboard manufacturer cheated here. I don't know if you will see a real difference or not, but it might be interesting to check and compare top speeds for all 4 WD Reds. Use the hdparm commands below and compare the very last numbers (ignore the other numbers) -

Parity on 2nd port (6.0): hdparm -tT /dev/sdc
Disk 4 on 3rd port (3.0): hdparm -tT /dev/sdd
Disk 3 on 4th port (3.0): hdparm -tT /dev/sde
Disk 1 on 6th port (3.0): hdparm -tT /dev/sdg

* Syslog shows xenbr0 being set up and used. I thought all xen stuff had been removed. Or perhaps you have something manually configuring xen networking functionality?

* Your -rc3 syslog shows the snippet below, an array of shareColor, shareFree, and shareSize vars. I show only the first 2, the 0 and 1 entries, but you have 4 entries. Another -rc3 syslog had 7 entries, so I'm guessing the count corresponds to the number of shares found? This appears to be an incomplete feature, one built in 2 parts, one part setting up the vars, and another part using them. Obviously, in -rc3, one of the parts was not finished, so I suspect that in -rc4 either this will be removed or a new display feature will be revealed. To head off criticism of a new feature added to the -rc's, this appears to be only cosmetic.

May 20 10:17:17 unFLK emhttp: shareColor.0 not found
May 20 10:17:17 unFLK emhttp: shareFree.0 not found
May 20 10:17:17 unFLK emhttp: shareSize.0 not found
May 20 10:17:17 unFLK emhttp: shareColor.1 not found
May 20 10:17:17 unFLK emhttp: shareFree.1 not found
May 20 10:17:17 unFLK emhttp: shareSize.1 not found
May 20 10:17:17 unFLK emhttp: shareColor.2 not found
...etc...
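The negotiated speed of each port can be pulled out of a syslog mechanically. The sketch below is a minimal illustration, assuming the usual kernel message wording "ataN: SATA link up X Gbps"; the sample lines are made up for the example and are not copied from the attached syslog.txt, and the function name is hypothetical.

```python
import re

# Matches the kernel's link negotiation message, e.g.
# "ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
LINK_RE = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")


def link_speeds(lines):
    """Return {port: negotiated speed in Gbps} from syslog lines."""
    speeds = {}
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            # A later message for the same port overwrites the earlier one.
            speeds[match.group(1)] = float(match.group(2))
    return speeds


# Illustrative sample lines only (hypothetical, not from the real syslog).
sample = [
    "kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)",
    "kernel: ata3.00: configured for UDMA/133",
]
for port, gbps in sorted(link_speeds(sample).items()):
    print(port, gbps)
```

Feeding it the lines of a real syslog (for example `open("syslog.txt")`) would list every port that negotiated a link during boot.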
Frank1940 Posted May 23, 2015

I would suggest opening up the case and double-checking that all of the SATA cables to the hard drives are securely in place on both ends. If you have any SATA cables that use mechanical locking devices to secure the cables, replace them. (One manufacturer changed the connector design on his hard drives and many (if not most) of the locking type SATA cables do not provide a reliable electrical connection! This would not apply if you are using quick-change hard drive cages.)
trurl Posted May 23, 2015

> ...One manufacturer changed the connector design on his hard drives and many (if not most) of the locking type SATA cables do not provide a reliable electrical connection!...

I seem to have missed this newsflash. Could you elaborate? As in, name names?
Frank1940 Posted May 23, 2015

> > ...One manufacturer changed the connector design on his hard drives and many (if not most) of the locking type SATA cables do not provide a reliable electrical connection!...
>
> I seem to have missed this newsflash. Could you elaborate? As in, name names?

I should have known (actually, I was dreading) that someone would ask for this information! I did a search on the vendor I thought was involved and found the thread: http://lime-technology.com/forum/index.php?topic=36065.msg335979#msg335979
FLK Posted May 24, 2015

> Well, I'm a bit stumped. Disk 4 is clearly reporting bad sectors, physical problems with the sector media, uncorrectable. But when we ask the SMART portion of the drive, it says "nope, no problems here, everything's fine"! So who's right? Have we got an internal mutiny? You recently did a SMART short test, try the long test, which will force SMART to examine every single sector. Polling time for the long test says 393 minutes, a little over 6.5 hours! Seems a little long, but maybe not.

I've run the long test and it "Completed without error"...

Minor question here: while running the long test I noticed that the disk spun down, so I disabled the "Spin down delay" and restarted the test. Maybe there's a bug to fix here? (Prevent the disk from spinning down while running the test? Or maybe the test was still running, but the UI was misleading, since I got a message saying something like "can't run the test while disk is spun down".)

> Some minor oddities - * The motherboard is claiming 6.0 gbps speeds for the 6 onboard SATA ports, but in this syslog, the last 4 ports only linked up at 3.0 gbps. The first 2 did link at 6.0 gbps. That means the SSD and your parity drive are on the fastest ports, which is desirable. But there is no explanation why the other 4 ports were not faster. I noticed this on Disk 4, which is a Red like the parity drive, and connected right next to it. Have to wonder if your motherboard manufacturer cheated here. I don't know if you will see a real difference or not, but it might be interesting to check and compare top speeds for all 4 WD Reds. Use the hdparm commands below and compare the very last numbers (ignore the other numbers)- Parity on 2nd port (6.0): hdparm -tT /dev/sdc Disk 4 on 3rd port (3.0): hdparm -tT /dev/sdd Disk 3 on 4th port (3.0): hdparm -tT /dev/sde Disk 1 on 6th port (3.0): hdparm -tT /dev/sdg

You are right about my motherboard: it has two 6 Gbps ports and four 3 Gbps ports. And the hdparm command gives me almost the same results on all the Red disks (~150 MB/s on buffered disk reads).

> * Syslog shows xenbr0 being set up and used. I thought all xen stuff had been removed. Or perhaps you have something manually configuring xen networking functionality?

That's the bridge name in the network settings. I did not bother to change it after the last update, so I guess it will stay here forever.

I also checked the cables (as suggested by Frank1940); everything looks fine to me.

The WD disk with the read errors is a refurbished one that I received just a few weeks ago, so I'll send it back again no matter what. But how can I know if my data is safe? Will the parity disk rebuild the data affected by the read errors when I replace the disk? Is there any way to list the files affected? I'm really in the dark here. What would you do?

Thanks for the support!
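A quick back-of-the-envelope check suggests why the 3.0 Gbps ports show no penalty with these drives. SATA uses 8b/10b encoding, so every payload byte costs 10 bits on the wire. The sketch below ignores all other protocol overhead, which is a simplifying assumption, and takes the ~150 MB/s figure from the hdparm result reported above; the function name is hypothetical.

```python
def sata_payload_mb_per_s(line_rate_gbps):
    """Upper bound on payload throughput for a SATA link, in MB/s.

    8b/10b encoding puts 10 bits on the wire per 8 data bits, so one
    payload byte costs 10 line bits. Other protocol overhead is ignored.
    """
    line_bits_per_s = line_rate_gbps * 1e9
    data_bits_per_s = line_bits_per_s * 8 / 10
    return data_bits_per_s / 8 / 1e6


DRIVE_MB_PER_S = 150  # approximate hdparm buffered read result from the thread

for rate in (3.0, 6.0):
    ceiling = sata_payload_mb_per_s(rate)
    print(f"{rate} Gbps link: ~{ceiling:.0f} MB/s ceiling, "
          f"about {ceiling / DRIVE_MB_PER_S:.0f}x the drive's sequential rate")
```

Under these assumptions even the slower link has roughly twice the bandwidth the drive can deliver on sequential reads, which is consistent with all four Reds measuring about the same.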
RobJ Posted May 25, 2015

> I've run the long test and it "Completed without error"...

That would seem to indicate the drive is fine. Which means we still have no explanation for the read errors. This drive is odd; it doesn't surprise me when you say it's refurbished. I just noticed another oddity, the SMART report claims it has a rotation rate of 5400 rpm! That does not seem possible for a WD Red. I suspect that on refurbishing, they reset many of the SMART numbers, and mistakenly set that one wrong.

I would try a parity check now, and if it completes without any issues, then the drive and array are probably fine. At that point, you could continue with the drive, or safely rebuild it. I wouldn't consider rebuilding without a successful parity check.

> Minor question here, while running the long test I noticed that the disk spun down... so I disabled the "Spin down delay" and restarted the test again, maybe there's a bug to fix here? (prevent the disk from spinning down while running the test? or maybe the test was still running... but the UI was misleading since I got a message saying something like "can't run the test while disk in spun down"...)

It's a minor issue we have lived with, and keep forgetting to mention to others, because it comes up rather infrequently. It has just now been added to the webGui, so I hope that a temporary disabling of spin down is also added, for this test. In this case, it was my fault for forgetting to mention it.
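The reason a clean parity check matters before a rebuild can be seen in a toy model. The sketch below assumes single parity computed as a bitwise XOR across the data disks, which is how unRAID's single parity works as far as I know; the three tiny "disks" and the function name are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks holding one tiny block each.
disk1 = b"\x10\x20\x30\x40"
disk2 = b"\xaa\xbb\xcc\xdd"
disk3 = b"\x01\x02\x03\x04"

# Parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# If disk2 is lost, XOR of the survivors and parity gives it back.
rebuilt = xor_blocks([disk1, disk3, parity])
print(rebuilt == disk2)

# If another disk returns wrong data during the rebuild, the result is wrong too.
bad_disk3 = b"\x01\x02\x03\xff"
print(xor_blocks([disk1, bad_disk3, parity]) == disk2)
```

A rebuild reads every other disk plus parity, so any unreadable or wrong sector elsewhere ends up in the rebuilt disk. That is why a parity check that completes without errors is a sensible precondition.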
Frank1940 Posted May 25, 2015

> ... I just noticed another oddity, the SMART report claims it has a rotation rate of 5400 rpm! That does not seem possible for a WD Red. .....

I have a WD Red 3TB and the SMART report says that it is a 5400 rpm drive. So that part is not an anomaly.
FLK Posted May 25, 2015

Parity check started.

I've always assumed that my Red disks were 5400 rpm, but you're right: when I look at the others' SMART reports, there's no rotation rate indication. But then they are not running the same firmware version. When I received the refurbished one I noticed (it's written on the big sticker) that it was running "NASWARE 3.0" while the others were running "NASWARE 2.0" (Firmware Version 82.00A82 vs 80.00A80). Maybe that's something "new".
garycase Posted May 25, 2015

> ... I just noticed another oddity, the SMART report claims it has a rotation rate of 5400 rpm! That does not seem possible for a WD Red.

Sounds about right. I know the Seagate NAS units that run at 5900 rpm have slightly better performance than the Reds (except for the 6TB Red, which has a 20% higher areal density than the 1TB/platter smaller units). WD doesn't actually state the rotational rate in their spec sheets ... calling it "IntelliPower" => implying a variable rotational rate, but I suspect they're pretty much fixed at ~ 5400 rpm. "IntelliPower" is just a marketing term that essentially means "we don't want to say".
RobJ Posted May 25, 2015

> > ... I just noticed another oddity, the SMART report claims it has a rotation rate of 5400 rpm! That does not seem possible for a WD Red. .....
>
> I have a WD Red 3TB and the smart report says that it is a 5400rpm drive. So that part is not an anomaly.

Thanks everyone, just shows my ignorance, and some bad assumptions. I had thought from the way they were talked about, and from their cost, that they were high performance drives comparable to the WD Blacks, but configured differently for the needs of a NAS. OK, an ignorant question: if they only spin at 5400 rpm, WHY do they cost so much?
Frank1940 Posted May 25, 2015

> > > ... I just noticed another oddity, the SMART report claims it has a rotation rate of 5400 rpm! That does not seem possible for a WD Red. .....
> >
> > I have a WD Red 3TB and the smart report says that it is a 5400rpm drive. So that part is not an anomaly.
>
> Thanks everyone, just shows my ignorance, and some bad assumptions. I had thought the way they were talked about, and from their cost, that they were high performance drives comparable to the WD Blacks, but configured differently for the needs of a NAS. OK, an ignorant question, if they only spin at 5400 rpm, WHY do they cost so much?

Pricing is a Marketing Department decision! They evaluate things like the warranty period (a marketing decision that has nothing to do with MTBF data) and the cost of any failures during that warranty period, the intended use of the device, the price of any competitive product that is intended to fill the same market niche, and the perceived value of the product to the potential buyer. If they think that enough of us are willing to pay a bit more for this product, they will set the price accordingly!
FLK Posted May 26, 2015

READ ERRORS, Season 1 Episode 5

Parity check is OK, with no read errors in the process... I really hate it when something stops working and then works again without my doing anything. Anyway, I started an "advance" replacement from WD, so I won't take any risk here.

So, last question: is there any way to "ask" the disk "tell me which file is located at sector xxxx"? (Since I have the sector list from the syslog.)
garycase Posted May 26, 2015 Share Posted May 26, 2015 ... if they only spin at 5400 rpm, WHY do they cost so much? My 1TB SSD doesn't spin at all and it costs almost triple the price of a 4TB WD Red ... more seriously, the Reds are designed to run cool; have specific anti-vibration features built in to minimize drive vibrations in 24/7 operation; and clearly have at least a bit factored into the price to pay for the longer warranty. I really don't think they're all that expensive => the 3 and 4TB versions are typically under $40/TB, and can be found on sale for ~ $35/TB or so. By comparison, the 4TB WD Blacks are over $50/TB [Currently $212 at Newegg and $239 at Amazon] ... and with the 1TB/platter areal density (1.2TB/platter for the 6TB units) the sustained data rates ... even on the slowest inner cylinders ... are well above the limitations of a Gb network, so there's little real-life impact of the slower rotation speed [the seek times are indeed longer than with 7200rpm units; so if you're doing a lot of consecutive writes without a cache drive you'll notice a small difference]. Quote Link to comment
RobJ Posted May 26, 2015

> So last question, is there any way to "ask" the disk : "tell me which file is located at sector xxxx" ? (Since I have sector's list from syslog)

It's possible, but a fair amount of work. Try Googling "badblocks ReiserFS", and look for an article or 2 that discuss how to calculate the file (if any) at a bad block location. I should caution you though that your sector list is very misleading, because of the varying block sizes being used at different levels of the disk I/O process.

For the future, you might want to look into bunker or bitrot, which run on the unRAID server, or CORZ Checksum, which runs in Windows. They create checksum files for use in detecting bit corruption and file corruption.
garycase Posted May 27, 2015

+1 for Corz checksum => runs on Windows, but does a very nice job of creating checksums for your files, and makes it very easy to validate them. If you suspect corruption on a disk, it's very simple to run Corz and isolate any corrupted files.
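The idea behind these checksum tools can be sketched in a few lines. This is only an illustration of the general technique (record a hash per file, re-hash later, report differences); it does not reproduce the file formats or options of bunker, bitrot, or Corz, and the function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large media files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Return {relative path: sha256 hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def changed_files(root, manifest):
    """List files that are missing or whose content no longer matches."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            bad.append(rel)
    return bad
```

A typical use would be to call `build_manifest("/mnt/disk4")` while the disk is known to be good, store the result, and run `changed_files` against it after an incident such as the read errors in this thread. A mismatch only says that a file changed; it cannot tell corruption from a legitimate edit, so the manifest has to be refreshed whenever files are modified on purpose.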
FLK Posted May 27, 2015

Isn't btrfs supposed to provide some sort of protection against that? I really need to get more familiar with the btrfs tools and features ^^

Anyway, I'll check those solutions, thanks! It would be nice to have one of them built in, with webUI integration.
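On the btrfs question: btrfs stores checksums for data and metadata, and a scrub reads everything back and verifies it against them. On a single-device filesystem there is generally no second copy of file data to repair from, so a scrub there mainly detects and reports damage. The syslog in the first post already shows a read-only scrub being started with `btrfs scrub start -B -r /mnt/disk4`. The sketch below wraps that same command; the wrapper functions are hypothetical helpers, and it assumes the `btrfs` binary is on the PATH.

```python
import subprocess


def scrub_command(mount_point, read_only=True):
    """Build the argv for a foreground btrfs scrub of one mount point."""
    cmd = ["btrfs", "scrub", "start", "-B"]  # -B: stay in the foreground
    if read_only:
        cmd.append("-r")  # -r: read-only, report errors without fixing
    cmd.append(mount_point)
    return cmd


def run_scrub(mount_point):
    """Run the scrub and return the finished process (needs root and btrfs)."""
    return subprocess.run(scrub_command(mount_point),
                          capture_output=True, text=True)


print(" ".join(scrub_command("/mnt/disk4")))
```

Because `-B` keeps the scrub in the foreground, `run_scrub` only returns when the scrub has finished, and the summary that btrfs prints is then available in the returned object's `stdout`.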