March 26, 2024 I woke up this morning to what looks like a bunch of errors. I tried to stop the array, but it seems to be hanging. I then tried to shut the server down to restart it, but that also seems to be hanging. So I grabbed the diagnostic files and am going to post them here. I've dealt with situations similar to this a few times now, but I always tried to tackle them myself, and that always caused more work in the end. So I'm just going to post the logs and get some guidance. (I did Tools > Diagnostics > zip file.) Let me know if you need something else. fs03-diagnostics-20240326-1033.zip
March 26, 2024 Community Expert
Mar 24 18:05:42 FS03 kernel: ata26.00: qc timeout (cmd 0xec)
Mar 24 18:05:42 FS03 kernel: ata24.00: qc timeout (cmd 0xec)
Mar 24 18:05:42 FS03 kernel: ata23.00: qc timeout (cmd 0xec)
Mar 24 18:06:02 FS03 kernel: sas: Internal abort: timeout 50010860005786c1
Mar 24 18:06:02 FS03 kernel: ata24.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:02 FS03 kernel: ata24.00: revalidation failed (errno=-5)
Mar 24 18:06:02 FS03 kernel: sas: Internal abort: timeout 50010860005786c3
Mar 24 18:06:02 FS03 kernel: ata24: hard resetting link
Mar 24 18:06:02 FS03 kernel: ata26.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:02 FS03 kernel: ata26.00: revalidation failed (errno=-5)
Mar 24 18:06:02 FS03 kernel: sas: Internal abort: timeout 50010860005786c0
Mar 24 18:06:02 FS03 kernel: ata23.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:02 FS03 kernel: ata23.00: revalidation failed (errno=-5)
Mar 24 18:06:12 FS03 kernel: ata26.00: qc timeout (cmd 0xec)
Mar 24 18:06:12 FS03 kernel: ata24.00: qc timeout (cmd 0xec)
Mar 24 18:06:12 FS03 kernel: ata23.00: qc timeout (cmd 0xec)
Mar 24 18:06:33 FS03 kernel: sas: Internal abort: timeout 50010860005786c1
Mar 24 18:06:33 FS03 kernel: sas: Internal abort: timeout 50010860005786c3
Mar 24 18:06:33 FS03 kernel: ata24.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:33 FS03 kernel: ata26.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:33 FS03 kernel: ata26.00: revalidation failed (errno=-5)
Mar 24 18:06:33 FS03 kernel: ata24.00: revalidation failed (errno=-5)
Mar 24 18:06:33 FS03 kernel: ata24: hard resetting link
Mar 24 18:06:33 FS03 kernel: sas: Internal abort: timeout 50010860005786c0
Mar 24 18:06:33 FS03 kernel: ata23.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:33 FS03 kernel: ata23.00: revalidation failed (errno=-5)
Mar 24 18:07:03 FS03 kernel: ata26.00: qc timeout (cmd 0xec)
Mar 24 18:07:03 FS03 kernel: ata24.00: qc timeout (cmd 0xec)
Mar 24 18:07:03 FS03 kernel: ata23.00: qc timeout (cmd 0xec)
Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c3
Mar 24 18:07:24 FS03 kernel: ata26.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:07:24 FS03 kernel: ata26.00: revalidation failed (errno=-5)
Mar 24 18:07:24 FS03 kernel: ata26.00: disable device
Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c0
Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c1
Mar 24 18:07:24 FS03 kernel: ata24.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:07:24 FS03 kernel: ata23.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:07:24 FS03 kernel: ata23.00: revalidation failed (errno=-5)
Mar 24 18:07:24 FS03 kernel: ata24.00: revalidation failed (errno=-5)
Mar 24 18:07:24 FS03 kernel: ata23.00: disable device
Mar 24 18:07:24 FS03 kernel: ata24.00: disable device
Looks more like a controller problem: several disks dropped offline at the same time. It can also be a power splitter shared by all devices.
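For anyone who wants to check the "several disks dropped at the same time" reading against their own syslog, here is a minimal Python sketch. It is not an Unraid tool; the `group_events` helper, the regular expression, and the `SAMPLE` list are illustrative assumptions, with the sample lines copied from the excerpt in this thread. It groups matching kernel messages by timestamp so that simultaneous drops across multiple ATA ports stand out.

```python
import re
from collections import defaultdict

# Hypothetical pattern for kernel lines shaped like the ones in this thread:
#   Mar 24 18:07:24 FS03 kernel: ata26.00: disable device
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<dev>ata\d+)(?:\.\d+)?:\s+(?P<msg>.*)$"
)


def group_events(lines, needle):
    """Map each timestamp to the set of ATA ports whose message contains `needle`."""
    events = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and needle in match.group("msg"):
            events[match.group("ts")].add(match.group("dev"))
    return dict(events)


# Sample lines taken from the log excerpt posted in this thread.
SAMPLE = [
    "Mar 24 18:07:03 FS03 kernel: ata26.00: qc timeout (cmd 0xec)",
    "Mar 24 18:07:03 FS03 kernel: ata24.00: qc timeout (cmd 0xec)",
    "Mar 24 18:07:03 FS03 kernel: ata23.00: qc timeout (cmd 0xec)",
    "Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c3",
    "Mar 24 18:07:24 FS03 kernel: ata26.00: disable device",
    "Mar 24 18:07:24 FS03 kernel: ata23.00: disable device",
    "Mar 24 18:07:24 FS03 kernel: ata24.00: disable device",
]

disabled = group_events(SAMPLE, "disable device")
for ts, devs in sorted(disabled.items()):
    # Several ports disabled in the same second point at something shared
    # (controller, cabling, power) rather than one failing disk.
    print(ts, sorted(devs))
```

With the sample above, all three ports are reported under the single timestamp `Mar 24 18:07:24`, which is consistent with a shared cause rather than three independent disk failures. To run it on a real log, pass the lines of the syslog file from the diagnostics zip instead of `SAMPLE`.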
March 26, 2024 Author What do you think the next steps are? I kind of just want to "reset" it and see if it happens again. I plan on building a new server in the next year or so, so I don't want to put too much time or resources into this one. I just need it to last until then.
March 26, 2024 Community Expert Reboot. That should bring the disks/controller back; there may or may not be a disabled disk afterwards.
March 26, 2024 Author Solution Yes, two disks are disabled: Parity 2 and Disk 1. I'm running SMART tests on them now. If everything looks "fine", what's the next step in bringing them back online from "disabled"? Stop Array > take the drives out of the config by setting them to "No Device" > Start Array > Stop Array > put the drives back into the config > Start Array > rebuild the array. Am I thinking about that correctly?
March 27, 2024 Community Expert Yes, but make sure the emulated disk1 mounts and its contents look correct before rebuilding on top of it.