kevin_h Posted January 4, 2018
This is my first unRaid server and I am not that familiar with file systems other than NTFS, so I am hoping someone can clear up my thinking. I have a 10-drive cache pool of 300GB 10k SAS drives for a total of 1.5TB of cache inside a Supermicro SC847 chassis with 10 data disks and dual parity. The other day I ran a balance and scrub on the cache and got some unrecoverable errors, so I started looking further and saw that I have some BTRFS errors in the disk logs. Since the cache pool is a RAID 1 setup, I'm a bit confused as to which disk is actually failing. I ordered 2 new disks as replacements, but I want to make sure I understand the error.
When I click on the Disk Log for sdr, I get:
Jan 4 06:04:17 FS kernel: blk_update_request: critical target error, dev sdr, sector 148276291
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 779, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 780, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 781, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 782, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 783, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 784, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 785, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0
But when I click on the Disk Log for sdq, I get:
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 751, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 752, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 753, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 754, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 755, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 756, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 757, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 758, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 759, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 760, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 761, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 762, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 763, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 764, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 765, flush 0, corrupt 0, gen 0
Jan 4 06:02:36 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 766, flush 0, corrupt 0, gen 0
Jan 4 06:02:37 FS kernel: BTRFS info (device sdq1): read error corrected: ino 17489478 off 9459154944 (dev /dev/sdr1 sector 147029536)
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 767, flush 0, corrupt 0, gen 0
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 768, flush 0, corrupt 0, gen 0
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 769, flush 0, corrupt 0, gen 0
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 770, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 771, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 772, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 773, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 774, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 775, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 776, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 779, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 780, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 781, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 782, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 783, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 784, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 785, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0
So is sdr going bad, is it sdq, or are both on their way out? I ordered 2 so I can replace both, but if only one is causing the errors on the mirror, I'd rather keep the second as a spare. Thanks.
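One way to read logs like the ones above is to count which device appears after "bdev" in each BTRFS error line. As far as I understand the message format, the name in parentheses ("device sdq1") identifies the mounted filesystem, while the "bdev" field names the member device whose error counter was incremented, which would explain why the same lines show up in both disk logs. The following is a minimal Python sketch under that assumption; the function name tally_bdev_errors and the regular expression are my own, written only against the line format quoted in this thread.

```python
import re
from collections import Counter

# Matches lines of the form quoted in the post, e.g.
# "BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0"
# Assumption: "device ..." names the filesystem, "bdev ..." the device with errors.
BDEV_RE = re.compile(
    r"BTRFS error \(device (?P<fs_dev>\S+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTER_NAMES = ("wr", "rd", "flush", "corrupt", "gen")


def tally_bdev_errors(log_text):
    """Return (number of error lines per bdev, highest counters seen per bdev)."""
    hits = Counter()
    highest = {}
    for match in BDEV_RE.finditer(log_text):
        bdev = match.group("bdev")
        hits[bdev] += 1
        previous = highest.get(bdev, {})
        highest[bdev] = {
            name: max(int(match.group(name)), previous.get(name, 0))
            for name in COUNTER_NAMES
        }
    return hits, highest


# A few lines copied from the post above; the first one is not a BTRFS line
# and is ignored by the pattern.
sample = """\
Jan 4 06:04:17 FS kernel: blk_update_request: critical target error, dev sdr, sector 148276291
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0
"""

hits, highest = tally_bdev_errors(sample)
for bdev, count in hits.items():
    print(bdev, count, highest[bdev])
```

If every matching line names the same bdev, as it does in both logs here, that device is the first one to look at.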
JorgeB Posted January 4, 2018
Looks like the problem is sdr, but you can confirm by looking at the output of:
btrfs dev stats /mnt/cache
kevin_h Posted January 4, 2018 Author Share Posted January 4, 2018 8 minutes ago, johnnie.black said: Looks like the problem is sdr, but you can confirm by looking at the output of: btrfs dev stats /mnt/cache Thank you very much for the response. According to that command, it is in fact sdr that is causing the errors. Every other stat is reporting 0. btrfs dev stats /mnt/cache [/dev/sdr1].write_io_errs 0 [/dev/sdr1].read_io_errs 793 [/dev/sdr1].flush_io_errs 0 ... Link to comment
jarsever Posted January 20, 2018
I am also having an issue with sdr1 in my cache pool. I have two disks (Samsung 840 EVO + Samsung 850 Pro) that are both 1TB in size. I am getting the same errors as above, except that it also says something about sdr1 being unaligned. Is there a way to fix that? Can I just bring down the array and format them both and then recreate the cache pool? I assume it should align them to 4k sectors, right?
Jan 19 17:15:56 unraid kernel: sd 5:0:7:0: [sdr] Unaligned partial completion (resid=1020, sector_sz=512)
[/dev/sdr1].write_io_errs 112
[/dev/sdr1].read_io_errs 136519
[/dev/sdr1].flush_io_errs 0
[/dev/sdr1].corruption_errs 0
[/dev/sdr1].generation_errs 0
JorgeB Posted January 20, 2018
"Can I just bring down the array and format them both and then recreate the cache pool?"
Reformatting the pool should fix the alignment issue, but those read and write errors are a hardware problem, usually a bad or flaky cable.
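The numbers in the "Unaligned partial completion" line can be checked directly: resid (1020) is not a whole multiple of sector_sz (512). Below is a small Python sketch of that arithmetic, assuming the message format quoted in this thread; the function name residual_remainder is made up for illustration. The arithmetic alone does not say whether the cause lies in the partition layout, the cable, or the controller.

```python
import re

# Pattern written against the single kernel line quoted in the thread.
UNALIGNED_RE = re.compile(
    r"Unaligned partial completion \(resid=(\d+), sector_sz=(\d+)\)"
)


def residual_remainder(line):
    """Return resid modulo sector_sz, or None if the line does not match."""
    match = UNALIGNED_RE.search(line)
    if match is None:
        return None
    resid, sector_sz = int(match.group(1)), int(match.group(2))
    return resid % sector_sz


line = (
    "Jan 19 17:15:56 unraid kernel: sd 5:0:7:0: [sdr] "
    "Unaligned partial completion (resid=1020, sector_sz=512)"
)

remainder = residual_remainder(line)
# A non-zero remainder means resid is not a multiple of the sector size.
print(remainder)
```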
jarsever Posted February 10, 2018
Turns out that one of my three HBA cards had a bad channel. I ended up replacing the card and the disk is back to normal. I just wish the speeds were faster with double parity. Maybe I'll just reduce to single parity for the speed increase.