September 6, 2021 (4 yr)
I have a cache pool made up of two 480 GB SSDs in RAID 1. Over the last few days I have noticed that my trim cron job has errored. I just looked at the server and the log file is nearly full. It is full of entries such as:

Sep 6 21:01:19 PCServer kernel: sd 10:0:0:0: [sdi] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00 cmd_age=0s
Sep 6 21:01:19 PCServer kernel: sd 10:0:0:0: [sdi] tag#28 CDB: opcode=0x2a 2a 00 11 52 0f 00 00 00 80 00
Sep 6 21:01:19 PCServer kernel: blk_update_request: I/O error, dev sdi, sector 290590464 op 0x1:(WRITE) flags 0x1800 phys_seg 16 prio class 0
Sep 6 21:01:19 PCServer kernel: BTRFS warning (device sdf1): lost page write due to IO error on /dev/sdi1 (-5)
Sep 6 21:01:19 PCServer kernel: BTRFS warning (device sdf1): lost page write due to IO error on /dev/sdi1 (-5)
Sep 6 21:01:19 PCServer kernel: BTRFS warning (device sdf1): lost page write due to IO error on /dev/sdi1 (-5)
Sep 6 21:01:19 PCServer kernel: BTRFS error (device sdf1): error writing primary super block to device 2
Sep 6 21:01:24 PCServer kernel: scsi_io_completion_action: 53 callbacks suppressed
Sep 6 21:01:24 PCServer kernel: sd 10:0:0:0: [sdi] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00 cmd_age=0s
Sep 6 21:01:24 PCServer kernel: sd 10:0:0:0: [sdi] tag#4 CDB: opcode=0x2a 2a 00 00 12 9c 88 00 00 18 00
Sep 6 21:01:24 PCServer kernel: print_req_error: 54 callbacks suppressed
Sep 6 21:01:24 PCServer kernel: blk_update_request: I/O error, dev sdi, sector 1219720 op 0x1:(WRITE) flags 0x0 phys_seg 3 prio class 0

I have a feeling that the second cache drive is failing or has already failed. I also cannot run a SMART report on it. Unraid is not showing any errors in the 'Main' tab, but the drive in question has a lot more reads and writes.

Since the drive is part of a cache pool, I should be able to stop the array, unassign the drive, power down the server, install a new SSD and assign it, and the pool should then rebuild. Is that right?
Any help or advice is greatly appreciated.
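A side note for anyone reading the log above: each "CDB:" line can be matched to the "sector" value in the I/O error that follows it. Opcode 0x2a is the SCSI WRITE(10) command, whose bytes 2 to 5 hold the starting logical block address and bytes 7 to 8 the transfer length in blocks, both big-endian. The sketch below only illustrates that arithmetic. The function name `decode_write10` is made up for this example and is not part of Unraid or of any kernel tooling.

```python
def decode_write10(cdb_hex: str) -> dict:
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration; it handles only the 10-byte
    WRITE(10) layout (opcode 0x2a) seen in the log lines above.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    return {
        # Bytes 2..5: starting logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: number of blocks to write, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# The two CDBs copied from the log entries above.
print(decode_write10("2a 00 11 52 0f 00 00 00 80 00"))
# {'lba': 290590464, 'blocks': 128}
print(decode_write10("2a 00 00 12 9c 88 00 00 18 00"))
# {'lba': 1219720, 'blocks': 24}
```

The decoded addresses equal the sector numbers in the two blk_update_request lines (290590464 and 1219720), so the failed commands are ordinary writes to sdi. That does not by itself show whether the SSD, the cable or the controller port is at fault.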
September 6, 2021 (4 yr) Community Expert
Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.
September 7, 2021 (4 yr) Author
I checked all the connections and made sure they were all OK. After rebooting the server I can now run a SMART test. Diagnostics are attached, along with a screenshot of the disk status. I think this disk is failing.
pcserver-diagnostics-20210907-1823.zip
September 7, 2021 (4 yr) Community Expert
Not entirely sure how to interpret SMART for SSDs. Run an extended SMART test on the drive.
September 7, 2021 (4 yr) Community Expert
Post new diagnostics or at least a new SMART report for that disk.