Rebuild super slow - keeps going in fits & starts


YsarKain

Unraid 6.11.5

 

One of my old drives died, and I replaced it with a drive that turned out to be DOA.

 

When I removed that drive and went to recheck everything, one of the other drives (the 16TB Toshiba) wasn't seated properly, which caused it to be disabled.

 

So now it's trying to rebuild that drive, the other one is missing while I wait for a replacement to arrive, and the rebuild is going ridiculously slowly.  It will speed up for a second or two, maybe a minute, then drop to 0.0 writes on the drive.  It's not throwing any errors, but I can't spend half a year waiting for it to finish at the current rate.

 

I still have the original 4TB drive, and it looks like the data on it is intact.  Maybe it did have a problem, but because I tried to replace it with the 14TB DOA drive, I can't put it back in.

 

Is there anything I can do?

 

unraid-smart-20230318-1317.zip unraid-smart-20230318-1309.zip unraid-diagnostics-20230318-1307.zip


So things are getting worse.  The Toshiba drive failed and got kicked out again, and when I tried mounting it in Unassigned Devices to see if the data was still there, it threw some I/O errors.

 

The 4TB drive that was the problem originally seems to mount fine in unassigned devices and I was able to run a diff to see the differences between the emulated drive and the disk.
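
For anyone wanting to repeat that kind of comparison, here is a minimal sketch in Python using the standard `filecmp` module. The two paths passed on the command line are placeholders (for example, the emulated disk's mount point and the Unassigned Devices mount point); they are assumptions, not paths taken from this system. Note that `filecmp.dircmp` does a shallow comparison by default, so files whose type, size, and modification time all match are treated as equal without reading their contents.

```python
import filecmp
import os
import sys


def compare_trees(left, right, rel=""):
    """Recursively compare two directory trees.

    Returns a dict of sorted relative paths:
      only_left  - entries that exist only under `left`
      only_right - entries that exist only under `right`
      differ     - files in both trees whose contents differ
    """
    cmp = filecmp.dircmp(os.path.join(left, rel), os.path.join(right, rel))
    result = {
        "only_left": [os.path.join(rel, n) for n in cmp.left_only],
        "only_right": [os.path.join(rel, n) for n in cmp.right_only],
        "differ": [os.path.join(rel, n) for n in cmp.diff_files],
    }
    # Descend into directories that exist on both sides.
    for sub in cmp.common_dirs:
        child = compare_trees(left, right, os.path.join(rel, sub))
        for key in result:
            result[key].extend(child[key])
    for key in result:
        result[key].sort()
    return result


if __name__ == "__main__":
    # Usage: python compare_trees.py <emulated_disk_path> <physical_disk_path>
    report = compare_trees(sys.argv[1], sys.argv[2])
    for key, paths in report.items():
        print(f"{key}: {len(paths)}")
        for p in paths:
            print(f"  {p}")
```

Running it read-only against both mount points lists what is missing or changed on either side without modifying anything.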

 

Maybe I'm grasping at straws here, but if I use unBalance to move everything off the failed drives and then do a New Config with both of those drives left out, should that work?


I put the 14TB drive that I thought was DOA back in, and it seems to be working now, but there's something screwy going on.

 

The rebuild of disks 10 and 22 is running, but very slowly.  The appdata share has disappeared from the shares list, but I can see it in the shell at /mnt/user/appdata.

 

When I looked in /mnt, disk22 was reporting an I/O error (see screenshot).
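
The same check can be scripted rather than eyeballed. The sketch below, in Python, tries to list every directory under /mnt and records which ones fail; the function name `probe_mounts` is made up for illustration, and the only assumption about the system is that the disks are mounted under /mnt as described above.

```python
import errno
import os


def probe_mounts(root="/mnt"):
    """Try to list each directory under `root` and return {name: status}.

    A status of "ok" means the directory could be listed; anything else
    is the error the kernel returned (EIO shows up as "I/O error").
    """
    results = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        try:
            os.listdir(path)
            results[name] = "ok"
        except NotADirectoryError:
            # Plain files under root are not mount points; skip them.
            continue
        except OSError as exc:
            if exc.errno == errno.EIO:
                results[name] = "I/O error"
            else:
                results[name] = f"error: {exc.strerror}"
    return results


if __name__ == "__main__":
    for name, status in probe_mounts().items():
        print(f"{name:12s} {status}")
```

This only reads directory listings, so it should be safe to run while a rebuild is in progress, although a failing disk can make the listing call hang for a while before it returns an error.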

 

I've attached a new diag file.

 

Thanks so much for looking at this, I'm going nuts trying to figure out what's going on.  The 16TB drive is four months old, well under warranty but it looks like Toshiba's support is absolute crap - I have to send the drive back and then wait for a prepaid visa in the mail.

2023-03-19 14_16_00-root@Unraid__mnt _ bash --login (Unraid).png

unraid-diagnostics-20230319-1414.zip


Another update - the estimated rebuild time is now up to 100 days.

 

I see the drive activity lights coming on across the board for a few seconds then going back to idle, over and over again.  No errors reported on the dashboard.

 

The appdata and domains shares have disappeared from the list.  The dockers that I do have turned on seem to be OK, except for MariaDB, which refuses to start.  Its log shows I/O errors when trying to read/write to /mnt/user/appdata.  appdata and domains were both set to cache only.  There doesn't appear to be any issue with the cache drives, which are set to RAID10.  I can access the files through /mnt/nucache no problem.

 

Normally on a rebuild or parity check I'll get 60-90 MB/s and it takes 2-3 days.  Now it's going around 2 MB/s.
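
To put those speeds in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes a 16 TB drive (the size of the Toshiba mentioned in this thread), decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), and a constant sustained speed, which a real rebuild will not have; `rebuild_days` is just an illustrative helper name.

```python
def rebuild_days(size_tb, speed_mb_s):
    """Days needed to write `size_tb` terabytes at `speed_mb_s` MB/s.

    Uses decimal units and assumes the speed stays constant.
    """
    seconds = (size_tb * 1e12) / (speed_mb_s * 1e6)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    for speed in (2, 30, 60, 90):
        print(f"{speed:3d} MB/s -> {rebuild_days(16, speed):.1f} days")
    # Prints:
    #   2 MB/s -> 92.6 days
    #  30 MB/s -> 6.2 days
    #  60 MB/s -> 3.1 days
    #  90 MB/s -> 2.1 days
```

So the usual 60-90 MB/s lines up with the 2-3 days normally seen, and roughly 2 MB/s works out to about three months for a 16 TB drive.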

 

I grabbed the backup file from the database and set it up on a VM on another host, so I don't need to worry about that not working.  

 

Some shares are reporting 0 free space, while others are showing the 84 TB free on the array.  The shares reporting zero (array and isos) are not accessible from my Proxmox server.
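
The free space each share reports can also be checked from the shell side. This is a small Python sketch using `shutil.disk_usage`; the share names in the example are the ones mentioned in this thread, the /mnt/user/ prefix follows the appdata path mentioned earlier, and `free_space_report` is a hypothetical helper name.

```python
import shutil


def free_space_report(paths):
    """Return {path: free bytes} or {path: "error: ..."} for each path."""
    report = {}
    for path in paths:
        try:
            report[path] = shutil.disk_usage(path).free
        except OSError as exc:
            # Missing shares or I/O errors end up here instead of crashing.
            report[path] = f"error: {exc.strerror}"
    return report


if __name__ == "__main__":
    shares = ["array", "isos", "uploads", "appdata", "domains"]
    report = free_space_report([f"/mnt/user/{s}" for s in shares])
    for path, free in report.items():
        if isinstance(free, int):
            print(f"{path}: {free / 1e12:.2f} TB free")
        else:
            print(f"{path}: {free}")
```

A share that shows 0 bytes free here, or returns an error, while others report the full array free space would match what the Shares page is showing.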

 

None of these shares use the cache except for one named uploads, which seems to be working fine.  Array is where I was storing Proxmox backups, and isos is where I store boot disk images for Proxmox (and Unraid's VMs), so this is more of an inconvenience than a problem - except that it could be a sign of bigger problems.

 

2023-03-19 15_36_43-Unraid_Shares.png

  • Solution

The system became completely unresponsive, so I shut it down.  I then removed disk22 completely, leaving the slot open.

 

The missing shares are back, and I can access all the shares from the network.

 

Share sizes are correct, and in /mnt all the disks are reporting normally.

 

I have to assume the Toshiba 16TB drive was the culprit, and it was somehow screwing with the rest of the array, even when it was not part of the array, since removing it has returned everything more or less to normal.

 

The rebuild now shows constant drive activity lights instead of intermittent ones.  The speed is still slow (~30 MB/s), but I have a lot of hashing going on in the background because of the shutdown.  Hopefully that will clear up in a few hours and the rebuild speed will get back to 60-90 MB/s, but even if it doesn't, five days for a rebuild is a hell of a lot better than 150.
