Michael_P Posted July 24, 2023

My cache drive went read-only this morning, and I'm not sure exactly why. A reboot brought it back online, but I don't know what to diagnose to keep it from happening again. For now, I've decided to move everything off it and reformat it just to be safe. It may be a coincidence, but I recently updated to 6.12.3, and the system was rock solid on 6.11.1.

Diagnostics from before and after the reboot are attached.

Quote

Jul 24 04:32:08 URServer kernel: nvme nvme0: I/O 293 (I/O Cmd) QID 3 timeout, aborting
Jul 24 04:32:11 URServer kernel: nvme nvme0: I/O 824 (I/O Cmd) QID 7 timeout, aborting
Jul 24 04:32:12 URServer kernel: nvme nvme0: I/O 97 (I/O Cmd) QID 1 timeout, aborting
Jul 24 04:32:22 URServer kernel: nvme nvme0: I/O 527 (I/O Cmd) QID 8 timeout, aborting
Jul 24 04:32:27 URServer kernel: nvme nvme0: I/O 528 (I/O Cmd) QID 8 timeout, aborting
Jul 24 04:32:33 URServer kernel: nvme nvme0: I/O 590 (I/O Cmd) QID 10 timeout, aborting
Jul 24 04:32:38 URServer kernel: nvme nvme0: I/O 522 (I/O Cmd) QID 9 timeout, aborting
Jul 24 04:32:38 URServer kernel: nvme nvme0: I/O 293 QID 3 timeout, reset controller
Jul 24 04:33:09 URServer kernel: nvme nvme0: I/O 7 QID 0 timeout, reset controller
Jul 24 04:34:00 URServer kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jul 24 04:34:00 URServer kernel: nvme0n1: I/O Cmd(0x2) @ LBA 3460415872, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
Jul 24 04:34:00 URServer kernel: I/O error, dev nvme0n1, sector 3460415872 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
Jul 24 04:34:00 URServer kernel: nvme nvme0: Abort status: 0x371
### [PREVIOUS LINE REPEATED 6 TIMES] ###
Jul 24 04:34:20 URServer kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jul 24 04:34:20 URServer kernel: nvme nvme0: Removing after probe failure status: -19
Jul 24 04:34:41 URServer kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 3, rd 1, flush 0, corrupt 0, gen 0
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jul 24 04:34:41 URServer kernel: nvme0n1: detected capacity change from 3907029168 to 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 4, rd 1, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 4, rd 3, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 4, rd 2, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 4, rd 4, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 5, rd 4, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 7, rd 4, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 8, rd 4, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: I/O error, dev loop2, sector 27452952 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 2
Jul 24 04:34:41 URServer kernel: I/O error, dev loop2, sector 27452976 op 0x0:(READ) flags 0x80700 phys_seg 39 prio class 2
Jul 24 04:34:41 URServer kernel: I/O error, dev loop2, sector 36998496 op 0x0:(READ) flags 0x1000 phys_seg 2 prio class 2
Jul 24 04:34:41 URServer kernel: I/O error, dev loop2, sector 8759376 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
Jul 24 04:34:41 URServer kernel: BTRFS: error (device nvme0n1p1: state A) in __btrfs_free_extent:3070: errno=-5 IO failure
Jul 24 04:34:41 URServer kernel: I/O error, dev loop2, sector 27453296 op 0x0:(READ) flags 0x80700 phys_seg 21 prio class 2
Jul 24 04:34:41 URServer kernel: BTRFS info (device nvme0n1p1: state EA): forced readonly

Attachments:
urserver-diagnostics-20230724-0605.zip
urserver-diagnostics-20230724-0704.zip
itimpi Posted July 24, 2023

According to the diagnostics, the Samsung SSD dropped offline: it does not show up in the SMART section of the diagnostics. It is unclear whether this is a real device issue or something else. I would check that the drive is well seated. Regardless, you should run a filesystem check on it, as some level of file system corruption has likely occurred.
Michael_P (Author) Posted July 24, 2023

26 minutes ago, itimpi said:
According to the diagnostics the Samsung SSD dropped offline as it does not show up in the SMART information part of the diagnostics.

It was still online in some fashion, as I could still read the files on the drive; I just could not write to it.
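
For anyone trying to pin down the order of events in a log like the one quoted in the first post, a rough sketch along these lines may help. It is not an Unraid tool; `summarize`, `SAMPLE_LOG`, and the two regular expressions are hypothetical names made up for illustration, and the patterns are only assumed to match the message formats seen in this particular excerpt. It counts the NVMe timeouts and controller resets, notes whether the kernel removed the device and whether BTRFS forced the filesystem read-only, and records the highest value reported for each BTRFS error counter.

```python
import re

# A few lines copied verbatim from the syslog excerpt in the first post.
SAMPLE_LOG = """\
Jul 24 04:32:08 URServer kernel: nvme nvme0: I/O 293 (I/O Cmd) QID 3 timeout, aborting
Jul 24 04:32:38 URServer kernel: nvme nvme0: I/O 293 QID 3 timeout, reset controller
Jul 24 04:34:20 URServer kernel: nvme nvme0: Removing after probe failure status: -19
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 4, rd 1, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 8, rd 4, flush 0, corrupt 0, gen 0
Jul 24 04:34:41 URServer kernel: BTRFS info (device nvme0n1p1: state EA): forced readonly
"""

# Assumed patterns, based only on the message formats in this excerpt.
TIMEOUT_RE = re.compile(r"nvme \w+: I/O \d+ .*timeout, (aborting|reset controller)")
BTRFS_ERRS_RE = re.compile(
    r"errs: wr (\d+), rd (\d+), flush (\d+), corrupt (\d+), gen (\d+)"
)
COUNTER_NAMES = ("wr", "rd", "flush", "corrupt", "gen")


def summarize(log_text):
    """Return a small dict summarizing NVMe and BTRFS events in log_text."""
    summary = {
        "timeouts": 0,
        "controller_resets": 0,
        "device_removed": False,
        "forced_readonly": False,
        "btrfs_max": {name: 0 for name in COUNTER_NAMES},
    }
    for line in log_text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            if timeout.group(1) == "aborting":
                summary["timeouts"] += 1
            else:
                summary["controller_resets"] += 1
        if "Removing after probe failure" in line:
            summary["device_removed"] = True
        if "forced readonly" in line:
            summary["forced_readonly"] = True
        errs = BTRFS_ERRS_RE.search(line)
        if errs:
            # Keep the highest value seen for each counter.
            for name, value in zip(COUNTER_NAMES, errs.groups()):
                summary["btrfs_max"][name] = max(
                    summary["btrfs_max"][name], int(value)
                )
    return summary


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LOG).items():
        print(f"{key}: {value}")
```

To use it on a real log, pass the text of the syslog file from the diagnostics zip to `summarize` in place of `SAMPLE_LOG`. Lines that do not match any pattern, including the "PREVIOUS LINE REPEATED" markers, are simply ignored, so repeated messages are undercounted.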