
Parity and data disk disabled at the same time


0rca
Go to solution Solved by JorgeB,


Hi all,

 

Two nights ago, just before going to bed, I noticed that my array was offline and that one parity disk and one data disk had been disabled at the same time. I was tired, so I shut down the server and decided to look into everything the next morning. I now know that was a mistake, because shutting down wiped the syslog :(

I have checked everything I could since then, but can only tell that both drives in question are fine and passed the SMART extended self-test.

My question now is: what is the safest way to restore these drives? Do I unassign the data disk first and then rebuild it onto itself? Do I start with the parity, since it is the faster drive and the rebuild will be quicker? Or am I thinking about this wrong, and my situation calls for a completely different approach? I'll attach the diagnostics and hope they are sufficient even though the relevant syslog has been overwritten.

 

Thanks in advance for any help.

Cheers,

 

Michael

deathstar-diagnostics-20220914-0949.zip deathstar-smart-20220915-0944.zip


I am not sure if it is preferred to create a new topic or to post in this one for continuity purposes. But I need help again, this time for real.

 

I had Unraid rebuild the array and after 12 hours everything was back to normal. Yay!

 

This morning I updated to 6.11.0-rc5 and rebooted (saving the diagnostics beforehand, just in case). Everything worked normally for a while, and then suddenly at 11:00 am I got read errors on ALL 16 disks. I was out of the office, so I am only seeing this now.

 

Attached are both diagnostic files, the one from this morning after the update and before the reboot and one I did just now before taking the array off-line.

 

I hope there's no reason for panic and would appreciate any help.

Cheers,

 

Michael

deathstar-diagnostics-20220916-0912.zip deathstar-diagnostics-20220916-1813.zip


That's odd, Orca... I also had a parity disk and a data disk drop from the array overnight... I rebuilt both of them onto some spare disks that I have...

 

Both disks' SMART reports look OK...

 

I'm not going to say that this might be a bug unless more reports come up... but the timing between your failure and mine is odd.

 

tower-diagnostics-20220916-1203.zip tower-diagnostics-20220915-0800.zip


Just FYI, it happened again today: all disks showed errors and the HBA was gone. I caught it just in time, went to the basement and measured temps. The HBA heatsink read close to 70 degrees Celsius, so the die temperature would be even higher. Not good.

I've added some active cooling to the HBA and booted back up. This time Parity 1 and Disk 4 were disabled. I am now rebuilding, hoping that my cooling is now sufficient.

 

I have two questions, to better understand the situation. Is it normal that exactly two disks (1 parity and 1 data) get disabled, because at the moment the HBA crashes there is bound to be one data drive with I/O plus one parity drive? Or am I simply lucky that it was just two, and it could easily have been more?

 

Theoretically, what would happen in the latter case? Assuming that only a few bytes might actually be wrong, would there be a way to restore the rest of the data, or would I have to copy the data from each drive somewhere else to save it? Having been a RAID user for decades, the whole Unraid concept is still new to me...

17 hours ago, 0rca said:

Is it normal that exactly two disks (1 parity and 1 data) get disabled, because at the moment the HBA crashes there is bound to be one data drive with I/O plus one parity drive? Or am I simply lucky that it was just two, and it could easily have been more?

It can disable any of the disks, parity or data; which one(s) is luck of the draw, but it won't disable more disks than there are parity drives.
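
To illustrate why that cap matters, here is a minimal Python sketch of single XOR parity. It is a toy model, not Unraid's actual code: the function names `xor_blocks` and `rebuild_missing` and the three-byte "disks" are made up for illustration, and a second parity drive uses different math that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity_block):
    """Recover the single missing block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Three hypothetical data "disks", each holding three bytes.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]

# Parity is the XOR of all data blocks.
parity = xor_blocks(data)

# Pretend the second disk was disabled and rebuild it from the rest.
rebuilt = rebuild_missing([data[0], data[2]], parity)
```

With only one parity block, losing two blocks at once leaves one equation with two unknowns, so nothing can be rebuilt. That is why the number of disks that can be disabled and still emulated is capped by the number of parity drives.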
