TexasTitan915 Posted July 22, 2020
I've been having random issues with a particular drive in my server. Disk5 will work perfectly fine one day, then after I power the server down for some time and power it back up, it boots with the drive in an error state. No read errors or anything.
The first time it happened I stopped the array, removed the disk, started the array back up, then powered down and reconnected the drive. I powered back up, assigned the disk to Disk5, and rebuilt the data onto the drive. There were no issues while the server was up for about a week. I powered down as part of routine maintenance, and here we are again after booting it back up.
I've attached a copy of the syslog. I see instances of "Synchronize Cache" failing on sdb, then sdc, then sdd. Could this be a case of the power supply going bad? It's an older 500 W Corsair CX500.
hurtadoserver-syslog-20200722-2040.zip
TexasTitan915 Posted July 22, 2020 (Author)
I swapped the power supply just to rule that out, and I'm pre-clearing the drive to rule out any issues with the drive itself.
Gragorg Posted July 23, 2020
Run an EXTENDED SMART test on the drive if you haven't already.
TexasTitan915 Posted July 27, 2020 (Author)
On 7/22/2020 at 9:39 PM, Gragorg said: Run an EXTENDED SMART test on the drive if you haven't already.
I ran two preclear cycles and all was good. Currently running an extended SMART test, will post back with results.
TexasTitan915 Posted July 28, 2020 (Author)
Extended SMART test finished without error. SMART report and current diagnostic reports attached.
hurtadoserver-smart-20200728-0816.zip
hurtadoserver-diagnostics-20200728-0819.zip
JorgeB Posted July 28, 2020
Disk dropped right after spin down:
Jul 22 14:34:43 HurtadoServer kernel: mdcmd (45): spindown 5
Jul 22 14:35:58 HurtadoServer kernel: sd 12:0:3:0: device_block, handle(0x000b)
Jul 22 14:36:00 HurtadoServer kernel: sd 12:0:3:0: device_unblock and setting to running, handle(0x000b)
Jul 22 14:36:00 HurtadoServer kernel: sd 12:0:3:0: [sdi] Synchronizing SCSI cache
Jul 22 14:36:00 HurtadoServer kernel: sd 12:0:3:0: [sdi] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Jul 22 14:36:00 HurtadoServer kernel: mpt2sas_cm0: removing handle(0x000b), sas_addr(0x4433221105000000)
Jul 22 14:36:00 HurtadoServer kernel: mpt2sas_cm0: enclosure logical id(0x51866da06d30c200), slot(6)
Set spin down to never on that drive and see if it makes a difference.
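For anyone who wants to check their own syslog for the same pattern, here is a minimal Python sketch that looks for a "Synchronize Cache(10) failed" line shortly after an "mdcmd ... spindown N" line. The function names, the 120-second window, and the assumed year are illustrative choices, not anything from Unraid itself, and a match only shows that the two events were close in time, not that the spun-down disk is the device that dropped.

```python
import re
from datetime import datetime

# Patterns based on the log lines quoted in this thread.
SPINDOWN = re.compile(r"mdcmd \(\d+\): spindown (\d+)")
SYNC_FAIL = re.compile(r"\[(sd[a-z]+)\] Synchronize Cache\(10\) failed")
TIMESTAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d+ \d{2}:\d{2}:\d{2})")


def parse_time(line, year=2020):
    """Return the syslog timestamp of a line, or None if it has none.

    Syslog lines carry no year, so one is assumed (hypothetical default).
    """
    m = TIMESTAMP.match(line)
    if not m:
        return None
    return datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")


def find_drops(lines, window_seconds=120):
    """List (disk slot, device, seconds after spindown) for each cache-sync
    failure logged within window_seconds of the most recent spindown."""
    last_spindown = None  # (timestamp, disk slot)
    hits = []
    for line in lines:
        t = parse_time(line)
        if t is None:
            continue
        m = SPINDOWN.search(line)
        if m:
            last_spindown = (t, int(m.group(1)))
            continue
        m = SYNC_FAIL.search(line)
        if m and last_spindown is not None:
            delta = (t - last_spindown[0]).total_seconds()
            if 0 <= delta <= window_seconds:
                hits.append((last_spindown[1], m.group(1), int(delta)))
    return hits


# Two of the lines quoted above, used as sample input.
SAMPLE = [
    "Jul 22 14:34:43 HurtadoServer kernel: mdcmd (45): spindown 5",
    "Jul 22 14:36:00 HurtadoServer kernel: sd 12:0:3:0: [sdi] "
    "Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00",
]

if __name__ == "__main__":
    print(find_drops(SAMPLE))  # [(5, 'sdi', 77)]
```

To run it against a real log, pass the open file instead of SAMPLE, for example find_drops(open("syslog")), where "syslog" is whatever path your extracted log has.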