Recommended way to proceed - multiple disk failures



I don't see a SMART report for the disabled disk5, so there is no way to know its health, but the emulated disk5 is mounted and seems to have its data.

 

Disk7 is empty or nearly so, but maybe it is new.

 

17 minutes ago, pendo said:

another that is reporting failing in less than 24 hours

Which disk is that?

Link to comment
Posted (edited)

Disk 7 is only a 1 TB disk, and I'm unsure why it is empty, as it's been in the array for a while iirc. Disk 5 is the disk that is emulated, as it's offline. The diagnostics file is the standard one from the Tools menu. Do I need to do something specific for the disk 5 SMART report to be included in the diagnostics?

 

Disk 2 is the one that says it will fail in less than 24 hours, with seek error problems.

Edited by pendo
Link to comment
19 minutes ago, pendo said:

Disk 7 is only a 1 TB disk, and I'm unsure why it is empty

Probably the default High-water allocation method hasn't gotten to it yet, since it is so small.
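To illustrate why a small disk can sit empty for a long time, here is a minimal Python sketch of High-water allocation as it is usually described: the mark starts at half the size of the largest disk, the lowest-numbered disk with more free space than the mark takes the next file, and the mark is halved when no disk qualifies. The disk sizes, file size, and function names are made up for the example, and this is a simplification under those assumptions, not Unraid's actual code.

```python
def pick_disk(free_gb, mark_gb, file_gb):
    """Pick a disk under a simplified high-water rule.

    Returns (disk index or None, possibly lowered mark).
    """
    # Keep halving the mark until some disk has more free space than it.
    while mark_gb >= 1:
        for idx, free in enumerate(free_gb):
            if free > mark_gb and free >= file_gb:
                return idx, mark_gb
        mark_gb /= 2
    return None, mark_gb  # nothing left that can take the file


def simulate(sizes_gb, file_gb, n_files):
    """Write n_files of file_gb each; return GB used per disk."""
    free = list(sizes_gb)
    mark = max(sizes_gb) / 2  # initial mark: half the largest disk
    for _ in range(n_files):
        idx, mark = pick_disk(free, mark, file_gb)
        if idx is None:
            break
        free[idx] -= file_gb
    return [size - f for size, f in zip(sizes_gb, free)]


# Hypothetical array: three 4 TB data disks and one 1 TB disk (sizes in GB).
sizes = [4000, 4000, 4000, 1000]
used = simulate(sizes, file_gb=50, n_files=150)
print(used)  # the 1 TB disk is still untouched after 7.5 TB of writes
```

In this made-up array the 1 TB disk is only considered once the mark has dropped to 500 GB, which requires the larger disks to fill well past three quarters first.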

 

Disk5 is disconnected (or maybe dead), which is the reason it got disabled and the reason there is no SMART report for it.

 

Looks like you may have some corruption on disk6 as well.

 

Disk2 is probably too bad to allow for a good rebuild of disk5, so maybe we can see if disk5 is good enough and then manipulate things so we can rebuild disk2 instead.

 

You should be seeing SMART ( 👎 ) warnings on the Dashboard page for disk2; maybe this has been going on for a while and you didn't notice.

 

Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected?

 

Do you have another copy of anything important and irreplaceable? Parity is not a substitute for backups.

 

Shut down, check connections, both ends, SATA and power. Then reboot and post new diagnostics to see if we get a SMART report for disk5.

 

Link to comment

You neglected to mention you were rebuilding disk6 when all this happened.

Mar 18 16:07:59 tower kernel: md: recovery thread: recon D6 ...
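For anyone who wants to find lines like this in their own syslog, here is a small Python sketch. The regular expression is an assumption based only on the single line quoted above, and the function name is hypothetical; the text after "recovery thread:" is returned as-is rather than interpreted.

```python
import re

# Pattern modeled on the one quoted syslog line; other formats may not match.
RECOVERY = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+kernel: md: recovery thread: (?P<msg>.*)$"
)


def recovery_events(lines):
    """Return (timestamp, message) pairs for md recovery thread lines."""
    events = []
    for line in lines:
        match = RECOVERY.match(line.rstrip("\n"))
        if match:
            events.append((match.group("stamp"), match.group("msg")))
    return events


sample = [
    "Mar 18 16:07:59 tower kernel: md: recovery thread: recon D6 ...",
    "Mar 18 16:08:01 tower some-other-service: unrelated line",
]
events = recovery_events(sample)
print(events)
```

The same function can be fed the lines of a syslog file opened with `open(path)`, where `path` is wherever your copy of the log lives.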

 

4 minutes ago, trurl said:

maybe this has been going on for a while and you didn't notice.

 

Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected?

 

Link to comment

Disk 6 and parity are newer (to me) disks in the array. They're SAS drives connected via the built-in LSI card on my mobo. However, my case doesn't have slots for them, so they sit in urethane disk spacers in the bottom of the case and are otherwise not mounted rigidly.

 

The tower got moved recently, and all of a sudden I started getting errors on disk 6 and it went offline (red X, emulated). I reseated the cables and did a SMART check and thought it was fine. However, I don't think I followed the correct procedure in getting it back into the array, as it started a complete rebuild.

 

It was after this that I started seeing the errors for disk 2, and disk 5 dropped off. I was under the impression that the rebuild finished alright, though.

 

Definitely need to get more storage, I know. I could add more SAS drives, but I don't have the physical space in this case, and the mobo does NOT have an 8088 (external) connection on the back, only the internal 8087 mini-SAS connections. Ideally, it's time for a complete upgrade (case, p/s, mobo, cpu, etc.), but that's not in the budget at this time.

 

Here's my dashboard now

Screenshot_20240323_110733_Chrome.jpg

Link to comment
1 hour ago, pendo said:

correct procedure in getting it back into the array, as it started a complete rebuild. 

It would have started the rebuild after you reassigned the disk, or assigned a replacement.

 

That is the correct procedure. The disk was out of sync. It was disabled because a write to it failed, and it had been emulated since then. That failed write and any subsequent writes to it were emulated, and a rebuild is the only way to recover those emulated writes. The only other way to get the array back in sync is to do a New Config and rebuild parity, but parity was not the problem.
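To make the "emulated writes" point concrete, here is a toy Python sketch that assumes a single-parity array where parity is the byte-wise XOR of the data disks. The disk names and two-byte contents are invented for the example; it only illustrates why the physical disk ends up stale and why a rebuild brings it back in sync.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy single-parity array: three data disks, two bytes each.
disks = {"disk1": b"\x11\x22", "disk2": b"\x0f\xf0", "disk3": b"\xaa\x55"}
parity = xor_blocks(list(disks.values()))

# disk2 gets disabled; whatever was on it at that moment stays on the platter.
stale = disks["disk2"]
others = [disks["disk1"], disks["disk3"]]

# A later write to the emulated disk2 cannot reach the physical disk,
# so only parity is updated to reflect the new data.
new_data = b"\xde\xad"
parity = xor_blocks(others + [new_data])

# Reading the emulated disk: XOR parity with all the other data disks.
emulated = xor_blocks([parity] + others)

# A rebuild writes the emulated contents onto the physical disk.
rebuilt = emulated
print(emulated == new_data, stale == new_data)
```

The emulated contents match the new data while the physical disk still holds the old bytes, which is why simply re-enabling the disk without a rebuild would leave it out of sync with parity.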

 

It's not clear the rebuild completed, but that might not be noticeable if you rebuilt to the same disk, because any sector it hadn't rebuilt yet would still have whatever data was there before, and maybe that data was OK.

Link to comment
