6.9.2 Errors on disk in data-rebuild of another disk.


I have to back up a couple of steps to explain. Disk 9 was not great, and I had a drive I thought was better that was being replaced on my main NAS. Rebuilding onto the "new" disk 9 was going at 30 MB/s and I was getting low helium warnings; for some reason I thought it would be better to let it finish, so I did. Then yesterday I replaced disk 9 again with another drive that came off my main NAS. It finished rebuilding this morning at a normal speed and, I think, with no major issues. Throughout all of that I noticed that disk 7 was having issues, so this morning I replaced it with another drive. I had some trouble rearranging power cables because I have a bunch of shucked WD drives. Initially a bunch of drives did not show up, but I didn't start the array.

 

I finally got every drive to show up other than disk 7, selected the new drive in that slot, and started the array. The number under ERRORS for disk 9 in the Main tab started shooting up, and I see that Log is at 100%, immediately after a reboot. It also wasn't showing any reads from disk 9 or writes to disk 7.

 

Is my issue filesystem related or hardware? What is the safest next step? I can put the previous disk 7 drive back in if needed. I also have a WD Red that came off my main NAS and is in the middle of two preclear passes as a test; I can swap it in for a drive if needed. Docker was disabled before any of this and there are no VMs. I paused the parity rebuild; errors are at 14 million.

 

Thank you.

flags-diagnostics-20211206-0936.zip

20 minutes ago, JorgeB said:

Disk9 dropped and reconnected, looks like a cable problem, replace cables and try rebuilding disk7 again.

Thank you. Disk9 wasn't touched today; however, I have been working around the 3.3 V pin issue with Molex-to-SATA adapters. I added a new adapter, and I think there is only one Molex connector to plug into, so I'm guessing Disk9 is plugged into that Molex and isn't getting enough power.

 

So I think the solution is to try taping the pins and spreading the SATA power load out as much as possible. Does that seem likely/possible?


I suspect it has been causing sporadic issues for a couple of years. I would stupidly swap out disks or SATA cables that had probably already been in the mix. I even moved what seemed to be the problem set of drives to a SAS card, and it continued.

 

I just did the tape thing and all the drives showed up, so I started the rebuild, and Disk7 shows "unmountable: not mounted". I think the correct procedure is to let the rebuild onto disk7 continue and, when it finishes, put the array in maintenance mode and run the varying degrees of filesystem checks, as described here: https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui . Or am I better off swapping the new disk7 for the older one and seeing if there are errors? Or both, in that order?

 

My understanding is that something caused a filesystem issue, probably insufficient power, and that the issue is present in the emulated disk7, which is being written to the real disk7. Once it is completely on disk7 and no longer emulated, I can hopefully get the FS check to fix it. If the FS issue was caused by power problems, then maybe the original disk7 is free of the errors?

flags-diagnostics-20211206-1622.zip

Edited by bobobeastie
Forgot diagnostics

The usual recommendation is to repair the emulated disk before rebuilding, but if you are rebuilding to a new disk and the original disk is still available with its contents it doesn't matter which you do first.

 

In fact, rebuilding to a new disk might be a good test of the hardware and the original disk still gives other options if it doesn't go well.


The rebuild finished, and the XFS check resulted in only a lost+found folder. I put the old disk7 in my other unRaid box, mounted it with UD, and can see the contents. I'm assuming that if I swap the new disk7 out for the old one I might need to create a new config and rebuild parity? Or I could delete lost+found and write the contents from UD over my network to the array? What's the safest option? Option 2 seems safer and involves fewer steps; ultimately I think I don't want the disk in the array.

 

Smart report attached for the old disk7 just in case. Quick version is 6 Reported uncorrect, 16 Current pending sector and 16 Offline uncorrectable.

 

EDIT: Should I run an XFS check on the drive in unassigned devices, and if so how?

 

nastheripper-smart-20211207-0929.zip

Edited by bobobeastie
question
On 12/7/2021 at 9:59 AM, bobobeastie said:

lost+found folder

How much is there?

 

On 12/7/2021 at 9:59 AM, bobobeastie said:

don't want the disk in the array

no

 

On 12/7/2021 at 9:59 AM, bobobeastie said:

XFS check on the drive in unassigned devices

Can you mount it in UD? If not then it won't know which filesystem check to run and you would have to do it from the command line.

 

Post new diagnostics


It was pretty much the whole drive, which had been full. The old disk7 is mounted on another unRaid system in unassigned devices. I gave up waiting (my problem, not blaming anyone), deleted lost+found, and started copying files over. For anyone in a similar situation, don't do that. Copying in Krusader, I had the transfer stall in at least one place, and the drive is up to 65535 reported uncorrect. I'm copying other directories and have transferred about half of the total drive; I'm going to just keep moving anything that will move. I didn't have enough free space to keep lost+found.
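For anyone else stuck copying off a failing disk, here is a minimal Python sketch of the "keep moving anything that will move" approach. It is only an illustration: the function name and both paths are made up, and it assumes unreadable files show up as I/O errors. A read that hangs, like the stall I hit in Krusader, would still block it.

```python
import os
import shutil
from pathlib import Path


def copy_what_will_copy(src_root, dst_root):
    """Copy every readable file under src_root to dst_root, keeping the layout.

    Returns (copied, failed): two lists of paths relative to src_root.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied, failed = [], []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = Path(dirpath).relative_to(src_root)
        (dst_root / rel_dir).mkdir(parents=True, exist_ok=True)
        for name in sorted(filenames):
            rel = rel_dir / name
            try:
                shutil.copy2(src_root / rel, dst_root / rel)
                copied.append(str(rel))
            except OSError as exc:
                # Assumption: a bad sector surfaces as an I/O error.
                # Drop any partial copy, record the file, and carry on.
                (dst_root / rel).unlink(missing_ok=True)
                failed.append(str(rel))
                print(f"skipped {rel}: {exc}")
    return copied, failed


if __name__ == "__main__":
    # Hypothetical paths: old disk mounted by UD, destination share on the array.
    ok, bad = copy_what_will_copy("/mnt/disks/old_disk7", "/mnt/user/recovered")
    print(f"{len(ok)} copied, {len(bad)} skipped")
```

The list of skipped files is the useful part: it tells you exactly what still needs to come from a backup.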

 

It occurred to me that a good solution would be to have a script move lost+found files back into place, using a list of file locations saved by a cron job or taken from the old disk7, therefore skipping the transfer of most files. Maybe this situation is too rare to make that worthwhile. I will be looking into creating a daily listing of which files are on which drives.
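A rough sketch of what that daily listing could look like in Python, in case it is useful to someone. It assumes the array disks are mounted at `/mnt/disk1`, `/mnt/disk2`, and so on; the output directory and both function names are hypothetical. Scheduling it (cron, User Scripts, or anything else) is left out.

```python
import glob
import os
from datetime import date


def list_files_per_disk(disk_glob="/mnt/disk[0-9]*"):
    """Return {disk mount point: sorted file paths relative to that disk}."""
    listing = {}
    for disk in sorted(glob.glob(disk_glob)):
        if not os.path.isdir(disk):
            continue
        files = []
        for dirpath, _dirnames, filenames in os.walk(disk):
            for name in filenames:
                files.append(os.path.relpath(os.path.join(dirpath, name), disk))
        listing[disk] = sorted(files)
    return listing


def write_listing(listing, out_dir):
    """Write one dated text file per disk; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for disk, files in listing.items():
        out_name = f"{os.path.basename(disk)}-{date.today().isoformat()}.txt"
        out_path = os.path.join(out_dir, out_name)
        # surrogateescape keeps odd, non-UTF-8 file names from aborting the run.
        with open(out_path, "w", encoding="utf-8", errors="surrogateescape") as fh:
            fh.write("\n".join(files) + ("\n" if files else ""))
        written.append(out_path)
    return written


if __name__ == "__main__":
    # Hypothetical output location; pick somewhere that is not on the array disks.
    write_listing(list_files_per_disk(), "/boot/config/file-listings")
```

A listing like this only records names and locations. It would tell you what was on a lost disk, but it cannot put the contents back.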

2 minutes ago, bobobeastie said:

It occurred to me that a good solution would be to have a script move lost+found files back into place, using a list of file locations saved by a cron job or taken from the old disk7, therefore skipping the transfer of most files. Maybe this situation is too rare to make that worthwhile. I will be looking into creating a daily listing of which files are on which drives.

If you actually examined lost+found you would realize how hopeless that is. Typically, files get put there because it doesn't know which folder they belong in, or even what the name of the file is.

 

lost+found is usually just full of numbered folders of numbered files or similar.

 

Just now, trurl said:

If you actually examined lost+found you would realize how hopeless that is. Typically, files get put there because it doesn't know which folder they belong in, or even what the name of the file is.

 

lost+found is usually just full of numbered folders of numbered files or similar.

 

I did look at a couple randomly, and the files I looked at had their correct names; I think I even saw directories, not at the root level, that had their correct names. It was the prospect of re-organizing hundreds of thousands of files that made me choose to delete and transfer from the old disk. But I totally understand if that's not what normally happens. I was going to say that comparing files might be an option in cases where the names were lost, though it would take too long, but files being split up makes that likely impossible. Plus, if all you have is a directory listing, there is nothing to compare against.
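To make the file-compare idea concrete, here is a hedged Python sketch that matches nameless lost+found files against a readable reference copy by size and SHA-256. All names are made up for illustration. It has exactly the limits mentioned above: every byte on both sides has to be read, split or damaged files will never match, and it needs real file contents on the reference side, not just a listing.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_by_content(root):
    """Map (size, sha256) -> relative path for every file under root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            key = (os.path.getsize(full), file_digest(full))
            # Identical duplicates collapse onto the first path seen.
            index.setdefault(key, os.path.relpath(full, root))
    return index


def match_orphans(orphan_root, reference_index):
    """Return {orphan relative path: original relative path} for intact files."""
    matches = {}
    for dirpath, _dirnames, filenames in os.walk(orphan_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            key = (os.path.getsize(full), file_digest(full))
            if key in reference_index:
                matches[os.path.relpath(full, orphan_root)] = reference_index[key]
    return matches
```

Anything missing from the returned mapping is either damaged or was never on the reference copy.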
