2 SATA drives fail at the same time under 6.9.2



4 hours ago, GeorgeJetson20 said:

I changed the disk assignment from disk 2 (still empty there) and put it in disk 14. It's doing a disk clear now.

Why didn't you wait for further advice?

 

Not only does that mean you have no chance to recover anything from the physical disk, but it also makes it more complicated to get your array back to normal, since you will have an additional unformatted disk in the array and nothing in slot 2, which is still disabled and expecting to be rebuilt.

 

The next step would have been to try to repair the filesystem on the physical disk so it could be mounted as an Unassigned Device. Somewhat similar to what you did with repairing the emulated disk. Perhaps the repaired filesystem on the physical disk wouldn't have had as much lost+found, if any.
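For reference (too late for this disk, but useful next time): repairing a physical disk outside the array is just xfs_repair against its partition from a terminal. The device name is whatever Unassigned Devices shows for it; sdX below is only a placeholder:

xfs_repair -n /dev/sdX1    # -n = check only, reports what it would fix without changing anything
xfs_repair /dev/sdX1       # the actual repair; only add -L if it refuses to run because of a dirty log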

 

Too late now. You will just have to live with the lost+found. Have you tried the Linux 'file' command?
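It can at least tell you what kind of data each recovered file contains, even though the original names are gone. Something like this from a terminal (path depends on where your lost+found ended up):

cd /mnt/disk2/lost+found
file * | less              # 'file' guesses each file's type from its contents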

 

As for your large number of mostly empty disks, it would make more sense not to have all of them installed until needed for capacity. Maybe New Config without all those and rebuilding both parity disks is the way forward now.

Link to comment

Emulated disk2 seems to have little if any data on it. Did you move the data off?

 

Looks like disks 6-13 may be empty, and of course new disk14 hasn't been formatted yet.

 

2 hours ago, trurl said:

Maybe New Config without all those and rebuilding both parity disks is the way forward now.

 

Link to comment
6 minutes ago, trurl said:

Emulated disk2 seems to have little if any data on it. Did you move the data off?

The diagnostics after the repair showed about 5T of data on it, so you must have moved that data elsewhere. Right?

 

Looks like only disks 1, 3, 5 have any significant amount of data currently.

 

Go to User Shares, click Compute All at the bottom, and wait for the results. If it doesn't produce complete results for all shares after a few minutes, refresh the page. Then post a screenshot.

Link to comment

Yes, I moved everything from disk2 and have already gone through about 80% of the files, but there's still some data to go through. In case it wasn't clear, disk 14 = disk 2, and it has 11 more hours to finish the disk clear. I'm really not sure of myself enough to not make a mistake with New Config. So once the clear finishes and the drive is back online, what do I do next? Are you saying to remove some of the 14TBs, if I understand you (just take them out of the array)? Eventually I wanted to fill the other disks with a large movie collection, but like I said, these first 6 months were kind of a trial to see how it behaved before slowly moving data over from my QNAP storage. Then I back this data up to my other server, a 5950X with a similar configuration but only 4x14TB and one parity drive right now.

Link to comment
5 minutes ago, GeorgeJetson20 said:

In case it wasn't clear, disk 14 = disk 2, and it has 11 more hours to finish the disk clear

I understand completely. The problem is that the array, as currently defined, still has a disk2 which needs rebuilding, in addition to that new disk14. The only way to get it to forget about either of those is New Config and rebuilding both parity disks. Might as well remove the other unused disks while you're at it.

 

 

Link to comment
10 minutes ago, GeorgeJetson20 said:

I'm really not sure of myself enough to not make a mistake with New Config

New Config has an option to retain all assignments. After you Apply, it will let you make any assignment changes you want before starting the array.

 

As long as you make sure that no data disk is assigned to any parity slot, and all data disks you want to include in the array are assigned to data slots, you should be good. New Config will only write to the parity disks, bringing them into sync with all the other disks in the array.

 

https://wiki.unraid.net/Manual/Tools#New_Config

Link to comment

trurl,

 

NEVER MIND BELOW - I figured out New Config. I kept the pool disks and it's in the process of rebuilding, but when it started I got the following message about parity disk 2. Here are the messages. (This is going to take DAYS, so I will leave you alone until it's done and post another diagnostics at that time. Sound good?)

[screenshot of the message about parity disk 2]

 

 

Sorry I'm asking so many questions. I removed some disks from the array and went to New Config, and I would only choose to preserve the POOL slots - is this correct? I didn't do it yet; I'm just not sure whether to preserve the array assignments, so I'm waiting before starting.

Edited by GeorgeJetson20
Link to comment

Did you read the wiki I linked?

Quote

In most cases selecting the option to keep all assignments is the best choice, as it puts you in a state where you just need to make any desired changes from your current assignments.

 

Have you decided what changes you are going to make?

Link to comment
25 minutes ago, GeorgeJetson20 said:

it's done and post another diagnostics

Let it run a while to see how it's going. If something seems wrong, post new diagnostics.

 

What you should see on Main - Array Devices is lots of writes to parity, lots of reads from all other array disks, zero in the Errors column for all disks.

 

That screenshot has more writes than I would expect on disk2. Are you writing to it? Maybe those numbers don't reset on New Config. You can reset them by clicking the Clear Stats button on Main - Array Operation.

 

Wouldn't hurt to disable Docker and VM Manager in Settings, and not write to the array until parity sync completes. Technically it shouldn't cause any problems, but it will perform better without extra disk activity.

 

Not sure if you are using the word "pool" the way it is meant in Unraid. The disks in the parity array are the "array". In addition to the array, you can have multiple "pools" such as cache that are not part of the array, but are part of user shares. "Pool" has a more generic meaning in other contexts to just mean a group of disks considered together for some purpose, but better to stick to the more specific Unraid way of thinking.
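If it helps to map that terminology to what you see in a terminal, the usual layout is roughly:

/mnt/disk1, /mnt/disk2, ...   # the individual array data disks
/mnt/cache                    # each pool gets its own mount point, named after the pool
/mnt/user                     # user shares, the merged view of the array plus the pools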

 

 

Link to comment

Nice, calm, less stressful day today. It's currently at 75% of the process. I did the Clear Stats on the drive stats, so all the drives except for the one I am writing to show about the same writes now. We had a terrible lightning storm this afternoon and lost power for five minutes, but no issue thanks to the UPS. It got my heart racing with "what if the power doesn't come back", but it did. Once the array is back to full, I will spin down all the drives I removed - I forgot about that last night. I've recovered all the missing data from the drives, so I'm back in business as far as that goes.

 

trurl, I really appreciate the help you gave me - I would have ended up probably losing the data and just starting fresh, not knowing any better. There are tons of YouTube videos, but when you're in a panic, you don't think clearly :)

 

I will keep you updated once the last step is done and post a diagnostics.zip.

Link to comment
  • 2 weeks later...
  • 4 weeks later...

Everything has been running well for the last month, but one thing I cannot figure out is why the cache will not get moved to the array anymore. It's at 37.1G, and normally the Mover would take care of it, but it's stopped and I'm not totally sure why. Here are my diagnostics. Maybe you will catch something that I am not aware of.

tower-diagnostics-20220827-1314.zip

Link to comment

Your appdata, domains, system shares have files on the array, and are configured to be moved to the array. You want these shares all on fast pool (cache) and configured to stay there, so Docker/VM performance isn't impacted by slower array, and so array disks can spin down since these files are always open.
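If you want to confirm that from a terminal, something like this will list which array disks and pools actually hold those shares (assuming the pool is named cache):

ls -d /mnt/disk*/appdata /mnt/disk*/domains /mnt/disk*/system 2>/dev/null
ls -d /mnt/cache/appdata /mnt/cache/domains /mnt/cache/system 2>/dev/null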

 

These are the only shares I looked at, since you have so many .cfg files in the config/shares folder, and 6.9.2 diagnostics don't give a summary of your shares, so I would have to open them all one at a time.

 

Go to the User Shares page, click the Compute All button at the bottom, and wait for the complete results. Complete results will show how much of each disk each user share is using. If you don't get complete results after a few minutes, refresh the page. Then post a screenshot.

Link to comment
On 8/27/2022 at 1:19 PM, GeorgeJetson20 said:

cannot figure out why the cache will not get moved to the array

As I was explaining in my previous post, there are some things you want to stay on cache. That screenshot shows what I was talking about.

On 8/27/2022 at 1:46 PM, trurl said:

Your appdata, domains, system shares have files on the array, and are configured to be moved to the array. You want these shares all on fast pool (cache) and configured to stay there, so Docker/VM performance isn't impacted by slower array, and so array disks can spin down since these files are always open.

Set the appdata, domains, and system shares to cache:prefer so they can be moved to cache. The names of some of your other shares suggest they are also involved in VMs, so you might want them set to cache:prefer as well.

 

Nothing can move open files. Disable Docker and VM Manager in Settings and then run Mover.
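If you want to see what is still sitting on the pool before and after a Mover run, something like this from a terminal works (assuming the pool is named cache):

du -sh /mnt/cache/* 2>/dev/null | sort -h    # per-share usage on the pool, smallest to largest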

 

Mover won't move duplicates, so there may still be some manual cleanup needed.
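If you want to hunt for those yourself, here is a rough sketch from a terminal - it walks the pool and reports any file whose relative path also exists on an array disk (paths assume a pool named cache):

cd /mnt/cache
find . -type f | while read -r f; do
  for d in /mnt/disk*; do
    [ -e "$d/$f" ] && echo "also on $d: $f"
  done
done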

 

You might want to back up some of that to the array; there are plugins (CA Backup, VM Backup) to help with some of that.

 

 

Link to comment

I did all of the items you mentioned. After setting the shares to cache:prefer, the size got larger (39G), but then today it dropped to 15G.

With the exception of VM Backup - that did not seem to work correctly (or at all). I took a look at it and its snapshot ability, but after trying it, it did not work, so I gave up.

 

I always thought the cache was there to temporarily hold the data, and that each time Mover runs it moves it to the array, so you are only in danger of losing something until Mover runs.

Link to comment
  • 2 months later...

I can't believe it, but it seems like two 14TB drives failed at the same time. I had been running smoothly without any issues for a long period since the other issue back in July, but at 11:43am today both drives (disk 3 and one of the parity drives) failed. I have not done anything yet, but here is the diagnostics.zip file - seeking some help again. Not sure if there is another issue going on, because both drives stopped at the same time with 'disk in error state (disk dsbl)'. Both have the same message, and when I click on the drives they both say 'a mandatory SMART command failed'. I have been using UrBackup and have many large disk images on the server, and I don't want to lose them.
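While I wait, should I try pulling a SMART report from the command line for each of the two drives, something like this (with whatever device letters they actually have), or leave everything alone?

smartctl -a /dev/sdX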

tower-diagnostics-20221122-1743.zip

Edited by GeorgeJetson20
Link to comment

I ran an xfs_repair on disk 3 and updated to the latest version of Unraid. (I had heard about some issues with the Fan plugin, which I had working well, so that is why I hesitated to upgrade this one.) But where do I go from here? Do I need to replace the hard drives? Both parity 1 and disk 3 failed. Not sure what to do next. I have one spare 14TB.
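In case it matters, my understanding is that the repair should be run with the array started in maintenance mode, against the emulated device rather than the raw drive - roughly this, if disk 3 maps to md3 (please correct me if that's wrong):

xfs_repair -v /dev/md3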

Will wait for more instructions.

Link to comment
