6.11.5 with 4 drives - one failed, one disabled


zylex
Solved by trurl


Hi all - I'm quite new to Unraid. Have a machine here that came to me with Unraid already on it, so I had a play and put a bit of data on it, then the next time I looked at it (probably months later) it was in this condition. This probably happened a year ago at a guess. I didn't have a spare drive of the right capacity at the time and it had been powered off for a while before we powered it back up and noticed this problem, so there's no useful syslog available. The data is nothing super critical but would like to regain access if possible. More importantly, I'd like to learn more about how to work with issues like this in Unraid, but had trouble finding info on what to do in this specific situation.

 

Unraid 6.11.5 on an HP Microserver Gen8, 4 drives. One failed and one disabled - as far as I know this happened at the same time.

 

Drives listed below:

image.png.548d6a604c3ef3b74a60c355553bd3ec.png

 

I have removed and replaced Disk 3. The old disk 3 (ID shown in the screenshot) was no longer detected in Unraid. I have tried following the instructions to re-enable Disk 2, and tried selecting the 'new' disk 3, but I can't get to a point where the array says anything other than 'invalid configuration', so I can't start it in maintenance mode as the instructions require, either for disk replacement or for enabling a drive by rebuilding onto itself. I assume this is because two drives have a problem at the same time. I've run a SMART test on Disk 2; the only issue seems to be that it exceeded a temperature threshold once.

 

Diagnostics attached. 

microserver1-diagnostics-20230202-1858.zip

Edited by zylex
Link to comment
8 minutes ago, zylex said:

as far as I know this happened at the same time

It wouldn't have disabled a disk if one had already failed, but it is possible one disk was disabled and then the other one failed before anyone did (or could do) anything about the disabled disk.

 

Those diagnostics can't tell us what happened in the past since syslog was reset on reboot.

 

You haven't been able to make any progress because Unraid wants to rebuild disabled disk2, but it needs all other disks including the original and now missing disk3 to do so.
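To see why a single-parity rebuild needs every other disk, here is a minimal Python sketch. It assumes plain XOR parity across the data disks, and the 4-byte "disks" and the `xor_blocks` helper are made up for illustration; this is not Unraid's actual code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical 4-byte "disks"; a real array does the same thing sector by sector.
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0x01, 0x02, 0x03, 0x04])
disk3 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
parity = xor_blocks([disk1, disk2, disk3])

# Rebuilding disk2 works when parity AND every other data disk are present.
rebuilt_disk2 = xor_blocks([parity, disk1, disk3])
print(rebuilt_disk2 == disk2)

# With disk3 missing as well, the same calculation yields disk2 XOR disk3,
# which is not disk2's contents.
without_disk3 = xor_blocks([parity, disk1])
print(without_disk3 == disk2)
```

With one parity disk there is only one equation per byte, so it can solve for one unknown disk, not two.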

 

Are you absolutely sure the original disk3 is unusable? Could be something other than the disk, such as a bad cable or port. Do you still have it?

 

It is possible to force it to rebuild disk3 using all other disks including currently disabled disk2, but how successful that will be would depend on how out-of-sync disk2 had gotten while disabled.
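The effect of an out-of-sync disk2 can be sketched in a few lines of Python. This assumes XOR parity and uses made-up 4-byte "disks"; `disk2_stale` stands for a physical disk2 that missed one write while it was disabled and emulated. It is an illustration only, not how Unraid is implemented.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)

disk1 = b"\x10\x20\x30\x40"
disk2_current = b"\x01\x02\x03\x04"  # contents that parity reflects (the emulated disk2)
disk3 = b"\xaa\xbb\xcc\xdd"          # the disk we want to rebuild
parity = xor_blocks(disk1, disk2_current, disk3)

# Hypothetical: the physical disk2 never received the write to its third byte.
disk2_stale = b"\x01\x02\xff\x04"

rebuilt_disk3 = xor_blocks(parity, disk1, disk2_stale)
bad_offsets = [i for i, (a, b) in enumerate(zip(rebuilt_disk3, disk3)) if a != b]
print(bad_offsets)
```

Only the offsets where disk2 is stale come out wrong on the rebuilt disk3, which is why the result depends on how much was written while disk2 was disabled.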

 

Past my bedtime, I will check back in the morning. Someone else will probably get to this before I do. Don't do anything without further advice.

Link to comment

No problem - the 'failed' disk definitely seems like a hardware failure. I removed it and put it in a USB drive dock connected to my laptop - unusual noises, undetectable, etc., if I remember correctly (it was late last year, so I can attempt it again but will wait until you or others reply further before I do so). The drives are in a cage with a backplane-type arrangement, but I did try removing the drive to bypass that cable and plugging it directly into the motherboard - also no success. Apologies - forgot to mention that in the initial post. I assume that because it detects the new drive in the same slot (and can carry out a SMART test, etc.) but couldn't detect the old drive, that would also point to drive failure.

Edited by zylex
Link to comment
  • Solution

Follow these instructions very carefully.

  1. Tools - New Config - Retain All - Apply.
  2. Assign new disk3, leave all other disk assignments as before.
  3. Important: In Main - Array Operations, check BOTH parity valid and Maintenance mode, then start the array. This will accept all disks into the array just as they are without writing to them.
  4. Stop the array, unassign disk3, start the array in normal mode (not Maintenance) with nothing assigned as disk3. This will disable disk3 so we can check if it is emulated well enough to rebuild, or needs filesystem repair before attempting rebuild.

Then post new diagnostics.

 

Link to comment

Disabled/emulated disk3 is mounted. However, it appears to be mostly empty. Is that expected?

 

Not a lot on disk2 either, but disk1 is too full. Not clear why those others weren't being used since all your shares have highwater allocation. Maybe they were new disks that hadn't been in the system long enough to get data. Disk2 would have been the next chosen and it hadn't reached highwater yet so disk3 wouldn't have any data.
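A rough Python sketch of high-water allocation as it is usually described: the mark starts at half the size of the largest data disk, the lowest-numbered disk with free space above the mark is chosen, and the mark is halved when no disk qualifies. The function name and the sizes (in GB) are hypothetical, and the real allocator may differ in detail.

```python
def pick_disk_high_water(free_space, largest_disk_size):
    """Return the 0-based index of the disk high-water would pick, or None.

    free_space: free space per data disk, in disk-number order.
    largest_disk_size: size of the largest data disk, in the same unit.
    """
    mark = largest_disk_size / 2
    while mark >= 1:
        for index, free in enumerate(free_space):
            if free > mark:
                return index
        mark /= 2  # no disk is above the mark, so lower it and try again
    return None

# Hypothetical array of three 4000 GB disks: disk1 nearly full,
# disk2 lightly used, disk3 empty.
choice = pick_disk_high_water([500, 3500, 4000], 4000)
print(f"next write goes to disk{choice + 1}")
```

In this made-up case disk2 is still above the 2000 GB mark, so it keeps being chosen and disk3 stays empty until disk2's free space falls to the mark.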

 

You must set Minimum Free on each of your user shares to larger than the largest file you expect to write to the share.
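Why the setting matters can be shown with a small Python model. It assumes the allocator picks a disk before it knows how large the file will become, and only skips a disk whose free space is below Minimum Free; `write_file` and the sizes (in GB) are hypothetical.

```python
def write_file(free_space, file_size, minimum_free):
    """Model what happens when one disk is considered for a new file."""
    if free_space < minimum_free:
        return "skipped: another disk is chosen"
    if file_size > free_space:
        return "failed: disk filled up mid-write"
    return "ok"

# A disk with 10 GB free and a 30 GB file on its way.
print(write_file(10, 30, minimum_free=0))   # Minimum Free left unset
print(write_file(10, 30, minimum_free=40))  # Minimum Free above the largest expected file
```

With Minimum Free set larger than the largest file, a disk that is chosen always has room for that file in this model.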

 

A couple of shares that had been configured in the past no longer exist. That might also be expected, since these cfg files don't get removed when a share is deleted.

p------------e                    shareUseCache="no"      # Share does not exist
s-------g                         shareUseCache="no"      # Share does not exist

 

 

Link to comment
2 hours ago, zylex said:

parity drive has now failed a smart test

That's not really a failed SMART test; it is just a SMART attribute that needs to be watched. Not a lot of sectors have been reallocated, and disks have spare sectors just for this purpose. It is probably OK for now if it doesn't get worse.

 

If you want to run a SMART test on it you can. An extended test will take several hours.

 

Might be worth doing since you are going to have to rebuild something, whether parity or disk3. Since disk3 doesn't have any data, maybe you would use that new disk as parity and remove disk3.

Link to comment

Being almost empty is not surprising - I vaguely recall everything was loaded up on the first disk and there was very little on the others prior to this issue, but I didn't know Unraid well enough to know why. I'll go and look at making those changes now.

 

Should I just attempt the rebuild process, and if so do I make those changes first or after?

 

At this point I'm thinking I'll get it going, get all the data off it to something else, and then start fresh with four new drives and configure everything from scratch using best practice.

Edited by zylex
Link to comment
