Server went down. Log files posted.



I woke up this morning to what looks like a bunch of errors. I tried to stop the array, but it seems to be hanging. I then tried to shut the server down to restart it, but that also seems to be hanging. So I grabbed the diagnostic files and am going to post them here.

 

I've dealt with similar situations a few times now, but I always tried to tackle them myself, and that always caused more work in the end. So I'm just going to post the logs and get some guidance. (I did Tools > Diagnostics > zip file.) Let me know if you need something else.

fs03-diagnostics-20240326-1033.zip

Mar 24 18:05:42 FS03 kernel: ata26.00: qc timeout (cmd 0xec)
Mar 24 18:05:42 FS03 kernel: ata24.00: qc timeout (cmd 0xec)
Mar 24 18:05:42 FS03 kernel: ata23.00: qc timeout (cmd 0xec)
Mar 24 18:06:02 FS03 kernel: sas: Internal abort: timeout 50010860005786c1
Mar 24 18:06:02 FS03 kernel: ata24.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:02 FS03 kernel: ata24.00: revalidation failed (errno=-5)
Mar 24 18:06:02 FS03 kernel: sas: Internal abort: timeout 50010860005786c3
Mar 24 18:06:02 FS03 kernel: ata24: hard resetting link
Mar 24 18:06:02 FS03 kernel: ata26.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:02 FS03 kernel: ata26.00: revalidation failed (errno=-5)
Mar 24 18:06:02 FS03 kernel: sas: Internal abort: timeout 50010860005786c0
Mar 24 18:06:02 FS03 kernel: ata23.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:02 FS03 kernel: ata23.00: revalidation failed (errno=-5)
Mar 24 18:06:12 FS03 kernel: ata26.00: qc timeout (cmd 0xec)
Mar 24 18:06:12 FS03 kernel: ata24.00: qc timeout (cmd 0xec)
Mar 24 18:06:12 FS03 kernel: ata23.00: qc timeout (cmd 0xec)
Mar 24 18:06:33 FS03 kernel: sas: Internal abort: timeout 50010860005786c1
Mar 24 18:06:33 FS03 kernel: sas: Internal abort: timeout 50010860005786c3
Mar 24 18:06:33 FS03 kernel: ata24.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:33 FS03 kernel: ata26.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:33 FS03 kernel: ata26.00: revalidation failed (errno=-5)
Mar 24 18:06:33 FS03 kernel: ata24.00: revalidation failed (errno=-5)
Mar 24 18:06:33 FS03 kernel: ata24: hard resetting link
Mar 24 18:06:33 FS03 kernel: sas: Internal abort: timeout 50010860005786c0
Mar 24 18:06:33 FS03 kernel: ata23.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:06:33 FS03 kernel: ata23.00: revalidation failed (errno=-5)
Mar 24 18:07:03 FS03 kernel: ata26.00: qc timeout (cmd 0xec)
Mar 24 18:07:03 FS03 kernel: ata24.00: qc timeout (cmd 0xec)
Mar 24 18:07:03 FS03 kernel: ata23.00: qc timeout (cmd 0xec)
Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c3
Mar 24 18:07:24 FS03 kernel: ata26.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:07:24 FS03 kernel: ata26.00: revalidation failed (errno=-5)
Mar 24 18:07:24 FS03 kernel: ata26.00: disable device
Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c0
Mar 24 18:07:24 FS03 kernel: sas: Internal abort: timeout 50010860005786c1
Mar 24 18:07:24 FS03 kernel: ata24.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:07:24 FS03 kernel: ata23.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Mar 24 18:07:24 FS03 kernel: ata23.00: revalidation failed (errno=-5)
Mar 24 18:07:24 FS03 kernel: ata24.00: revalidation failed (errno=-5)
Mar 24 18:07:24 FS03 kernel: ata23.00: disable device
Mar 24 18:07:24 FS03 kernel: ata24.00: disable device

 

This looks more like a controller problem, since several disks dropped offline at the same time. It could also be a power splitter shared by all of those devices.
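If you want to confirm the "dropped at the same time" pattern on a longer syslog, here is a rough Python sketch. It assumes the line format shown in the excerpt above (`<date> <host> kernel: ataN.00: disable device`); the function name `ports_disabled_together` and the regex are just illustrative, not part of any Unraid tooling.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   Mar 24 18:07:24 FS03 kernel: ata26.00: disable device
# The pattern is an assumption based on the syslog excerpt in this thread.
DISABLE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<port>ata\d+)\.\d+: disable device"
)


def ports_disabled_together(lines):
    """Group ATA ports by the timestamp at which the kernel disabled them."""
    by_time = defaultdict(list)
    for line in lines:
        m = DISABLE_RE.match(line.strip())
        if m:
            by_time[m.group("ts")].append(m.group("port"))
    # Keep only timestamps where more than one port was disabled.
    return {ts: sorted(ports) for ts, ports in by_time.items() if len(ports) > 1}


# Three lines taken from the syslog excerpt above.
sample = [
    "Mar 24 18:07:24 FS03 kernel: ata26.00: disable device",
    "Mar 24 18:07:24 FS03 kernel: ata23.00: disable device",
    "Mar 24 18:07:24 FS03 kernel: ata24.00: disable device",
]
print(ports_disabled_together(sample))
```

Several ports listed under one timestamp points toward something they share (controller, cable, or power) rather than the individual disks.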


Yes, two disks are disabled: Parity 2 and Disk 1. I'm running SMART tests on them now. But if everything looks "fine", what's the next step in bringing them back online from "disabled"?

 

Stop array > Take the drives out of the "config" by setting them to "no device" > Start array > Stop array > Put the drives back into the "config" > Start array > Rebuild array

 

Am I thinking about that correctly?
