
Cache Pool Disk Error


kevin_h


This is my first unRaid server and I am not that familiar with file systems other than NTFS, so I am hoping someone can clear up my thinking real quick.  I have a 10-drive cache pool of 300GB 10k SAS drives, for a total of 1.5TB of cache, inside a Supermicro SC847 chassis with 10 data disks and dual parity.

 

The other day I ran a balance and scrub on the cache and got some unrecoverable errors, so I looked further and saw some BTRFS errors in the disk logs.  Since the cache pool is a RAID 1 setup, I'm a bit confused as to which disk is actually failing.  I ordered 2 new disks as replacements, but I want to make sure I understand the error.

 

When I click on the Disk Log for sdr, I get:

Jan 4 06:04:17 FS kernel: blk_update_request: critical target error, dev sdr, sector 148276291
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 779, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 780, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 781, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 782, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 783, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 784, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 785, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0

But when I click on the Disk Log for sdq, I get:

Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 751, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 752, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 753, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 754, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 755, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 756, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 757, flush 0, corrupt 0, gen 0
Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 758, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 759, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 760, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 761, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 762, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 763, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 764, flush 0, corrupt 0, gen 0
Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 765, flush 0, corrupt 0, gen 0
Jan 4 06:02:36 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 766, flush 0, corrupt 0, gen 0
Jan 4 06:02:37 FS kernel: BTRFS info (device sdq1): read error corrected: ino 17489478 off 9459154944 (dev /dev/sdr1 sector 147029536)
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 767, flush 0, corrupt 0, gen 0
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 768, flush 0, corrupt 0, gen 0
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 769, flush 0, corrupt 0, gen 0
Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 770, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 771, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 772, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 773, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 774, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 775, flush 0, corrupt 0, gen 0
Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 776, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 779, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 780, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 781, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 782, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 783, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 784, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 785, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0

 

So is sdr going bad, is sdq going bad, or are both on their way out?  I ordered 2 disks so I can replace both, but if only 1 is causing the errors and the other only shows them because of the mirror, I'd rather keep the second as a spare.  Thanks.
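
One way to read the logs above: in each BTRFS line, the `(device sdq1)` part labels the filesystem the message belongs to, while the `bdev` field names the pool member whose error counters changed. The following is only a minimal Python sketch that tallies both fields; the function name `tally` and the regular expression are illustrative assumptions, and the sample text is three lines copied from the sdr disk log.

```python
import re
from collections import Counter

# Matches lines like:
# "BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, ..."
LINE = re.compile(
    r"BTRFS error \(device (?P<fs>\w+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+),"
)

def tally(log_text):
    """Count how often each filesystem label and each bdev appears."""
    fs_labels, bdevs = Counter(), Counter()
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:  # non-BTRFS lines (e.g. blk_update_request) are skipped
            fs_labels[m.group("fs")] += 1
            bdevs[m.group("bdev")] += 1
    return fs_labels, bdevs

# Sample copied from the sdr disk log in this post.
SAMPLE = """\
Jan 4 06:04:17 FS kernel: blk_update_request: critical target error, dev sdr, sector 148276291
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
"""

if __name__ == "__main__":
    fs_labels, bdevs = tally(SAMPLE)
    print("filesystem labels:", dict(fs_labels))
    print("devices with errors:", dict(bdevs))
```

If every matched line names the same `bdev`, as it does in both logs here, that device is the one accumulating read errors, regardless of which disk log the lines were shown under.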

8 minutes ago, johnnie.black said:

Looks like the problem is sdr, but you can confirm by looking at the output of:

 


btrfs dev stats /mnt/cache

 

Thank you very much for the response.  According to that command, it is in fact sdr that is causing the errors.  Every other stat is reporting 0.

 btrfs dev stats /mnt/cache
[/dev/sdr1].write_io_errs   0
[/dev/sdr1].read_io_errs    793
[/dev/sdr1].flush_io_errs   0
...
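
With a 10-drive pool, picking the nonzero counters out of the `btrfs dev stats` output by eye gets tedious. Below is a small Python sketch that parses that output and keeps only counters above zero. The function name `nonzero_stats` is made up for illustration, and the sdq1 lines in the sample are assumed (they only reflect the statement that every other stat is 0); the sdr1 lines are the ones shown above.

```python
import re

# Matches lines such as "[/dev/sdr1].read_io_errs    793"
STAT_LINE = re.compile(
    r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$"
)

def nonzero_stats(text):
    """Return {device: {counter: value}} for counters greater than zero."""
    result = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m is None:
            continue  # skip blank lines, "..." and anything unexpected
        value = int(m.group("value"))
        if value > 0:
            result.setdefault(m.group("dev"), {})[m.group("counter")] = value
    return result

# sdq1 lines are assumed for illustration; sdr1 lines are from this post.
SAMPLE = """\
[/dev/sdq1].write_io_errs   0
[/dev/sdq1].read_io_errs    0
[/dev/sdq1].flush_io_errs   0
[/dev/sdr1].write_io_errs   0
[/dev/sdr1].read_io_errs    793
[/dev/sdr1].flush_io_errs   0
"""

if __name__ == "__main__":
    for dev, counters in nonzero_stats(SAMPLE).items():
        print(dev, counters)
```

To run it against the live pool, the output of `btrfs dev stats /mnt/cache` could be saved to a file and passed to `nonzero_stats` as text.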

 

  • 3 weeks later...

I am also having an issue with sdr1 in my cache pool.  I have two disks (Samsung 840 EVO + Samsung 850 Pro) that are both 1TB in size.  I am getting the same errors as above, except it also says something about sdr1 being unaligned.  Is there a way to fix that?  Can I just bring down the array, format them both, and then recreate the cache pool?  I assume that should align them to 4k sectors, right?

 

Jan 19 17:15:56 unraid kernel: sd 5:0:7:0: [sdr] Unaligned partial completion (resid=1020, sector_sz=512)

[/dev/sdr1].write_io_errs    112
[/dev/sdr1].read_io_errs     136519
[/dev/sdr1].flush_io_errs    0
[/dev/sdr1].corruption_errs  0
[/dev/sdr1].generation_errs  0

 

  • 3 weeks later...

Archived

This topic is now archived and is closed to further replies.
