viviolay Posted April 16, 2023

Hello, I'm new to managing my server and am having an issue with my log being reported as 100% full. This is the second time it has happened and the second time I have had to restart. After this restart, though, one of the two disks in my cache pool is marked as missing. I'm hoping someone can help me diagnose and fix the issue. Attached is my diagnostics zip file. Thank you.

diagnostics-20230416-1522.zip
EDACerton Posted April 16, 2023 (edited) Solution

Quote:
Apr 16 05:19:57 DolphinCove kernel: sd 4:0:0:0: [sdd] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
Apr 16 05:19:57 DolphinCove kernel: sd 4:0:0:0: [sdd] tag#2 CDB: opcode=0x93 93 08 00 00 00 00 13 21 10 00 00 02 00 00 00 00
Apr 16 05:19:57 DolphinCove kernel: I/O error, dev sdd, sector 320933888 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
Apr 16 05:19:57 DolphinCove kernel: sd 4:0:0:0: [sdd] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=0s
Apr 16 05:19:57 DolphinCove kernel: sd 4:0:0:0: [sdd] tag#9 CDB: opcode=0x93 93 08 00 00 00 00 13 41 10 00 00 02 00 00 00 00
Apr 16 05:19:57 DolphinCove kernel: I/O error, dev sdd, sector 323031040 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064811, rd 79904, flush 17680, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064812, rd 79904, flush 17680, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064813, rd 79904, flush 17680, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064814, rd 79904, flush 17680, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064814, rd 79904, flush 17681, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS warning (device sdc1): lost page write due to IO error on /dev/sdd1 (-5)
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064815, rd 79904, flush 17681, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS warning (device sdc1): lost page write due to IO error on /dev/sdd1 (-5)
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064816, rd 79904, flush 17681, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS warning (device sdc1): lost page write due to IO error on /dev/sdd1 (-5)
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr 169064817, rd 79904, flush 17681, corrupt 167154, gen 0
Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): error writing primary super block to device 2

The MX500 drive is flooding the logs with failures. The drive is offline and not reporting SMART data, so I would start with reseating/replacing the cables for it.

Once the drive is stable (and assuming that it somehow didn't get removed from the pool configuration entirely), you should be able to run a scrub to repair the data on the drive.

Although, personally I'd probably just remove the MX500 from the pool, blkdiscard it, then add it again and scrub to give myself a little more confidence about the drive. That might just be me though.

Edited April 17, 2023 by EDACerton
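For anyone trying to read the counters in the log excerpt above, here is a minimal Python sketch that pulls the last-seen BTRFS error counters per device out of a syslog dump. It is not from the original posts: the function name `latest_btrfs_counters` and the regular expression are assumptions based only on the line format shown in the quoted excerpt, so treat it as an illustration rather than a general-purpose parser.

```python
import re
from typing import Dict

# Assumed line format, taken from the excerpt quoted above:
#   BTRFS error (device sdc1): bdev /dev/sdd1 errs: wr N, rd N, flush N, corrupt N, gen N
BTRFS_ERRS = re.compile(
    r"BTRFS error \(device (?P<fs>\S+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTER_NAMES = ("wr", "rd", "flush", "corrupt", "gen")


def latest_btrfs_counters(log_text: str) -> Dict[str, Dict[str, int]]:
    """Return the most recent error counters seen for each block device.

    Later matches overwrite earlier ones, so the result reflects the
    last matching line per device in the order the log was written.
    """
    latest: Dict[str, Dict[str, int]] = {}
    for match in BTRFS_ERRS.finditer(log_text):
        latest[match.group("bdev")] = {
            name: int(match.group(name)) for name in COUNTER_NAMES
        }
    return latest


if __name__ == "__main__":
    # Two lines copied from the excerpt above.
    sample = (
        "Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): "
        "bdev /dev/sdd1 errs: wr 169064811, rd 79904, flush 17680, "
        "corrupt 167154, gen 0\n"
        "Apr 16 05:19:57 DolphinCove kernel: BTRFS error (device sdc1): "
        "bdev /dev/sdd1 errs: wr 169064817, rd 79904, flush 17681, "
        "corrupt 167154, gen 0\n"
    )
    for bdev, counters in latest_btrfs_counters(sample).items():
        print(bdev, counters)
```

In the excerpt, the write counter climbs by one with nearly every line while the read and corruption counters stay fixed, which fits a drive that has stopped accepting writes. On a live system, `btrfs device stats <mountpoint>` reports the same per-device counters directly, without parsing the log.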
viviolay Posted April 18, 2023 Author

On 4/16/2023 at 4:47 PM, EDACerton said:
The MX500 drive is flooding the logs with failures. The drive is offline, and not reporting SMART data, so I would start with reseating/replacing the cables for it. Once the drive is stable (and assuming that it somehow didn't get removed from the pool configuration entirely), you should be able to run a scrub to repair the data on the drive. Although, personally I'd probably just remove the MX500 from the pool, blkdiscard it, then add it again and scrub to give myself a little more confidence about the drive. That might just be me though

Thanks for your help. I ended up powering the server off and back on, and the drive showed up in the pool again. Even so, I'll go check the connectors and swap the cables.

The drive is a recent-ish purchase. Should I look into exchanging it instead, or is it more likely a cable/seating issue like you mentioned? Is there anything else that can cause this kind of error that I should know about?

Thank you again for your help.