Nick_J

Existing disk showing as 'New Device' after UDMA CRC errors detected during parity swap



Posted (edited)

Hi everyone,

 

I fear I've lost some data here.

 

Basically, I was carrying out the parity swap procedure. I had already copied parity (6TB) to the new larger disk (10TB), and I had started the array, which commenced a rebuild of the data disk (2TB) that had been replaced by the old parity disk (6TB).

 

Shortly after the rebuild commenced, it halted due to read errors on one of the other disks. I pressed resume, which may not have been the best idea. It then started speeding through at 2-3GB/sec, which was clearly incorrect, so I stopped the rebuild. There were a few UDMA CRC errors present in the SMART report of that disk.

 

At this stage Unraid was reporting that disk (disk 3) as missing. I figured it might be a cabling issue, so I shut down the machine, checked all the cabling, and powered it back up. Now the disk is present, but unassigned in Unraid. When I select the disk that was already assigned as disk 3, though, a blue icon appears next to it indicating it's a 'New Device'.

 

The array won't start due to 'Too many wrong and/or missing disks!'.

 

Is there a way to 'remind' Unraid that the disk was already part of the array? Or is it a lost cause, meaning I need to start a new array and cope with having lost the data on that disk? (At least for the 2TB one that was being rebuilt, I can still mount the original disk outside the array and restore the files, I guess.)

 

 

Thanks,

 

Nick

Edited by Nick_J


Hi Nick,

Unraid has dropped that disk from the array. Even though it's present again now that the server has been powered back on, and even though it's the original drive, because it was dropped Unraid now considers it a new disk when you assign it to the disk 3 slot, and wants to rebuild it from parity.

However, since this disk dropped out while the rebuild (replacing the 2TB) was happening, it can't rebuild disk 3: there are now two missing drives, and one parity disk can only rebuild one failed drive, not two. This is why you see the error 'Too many wrong and/or missing disks!'.

 

There may be a way to recover from this.

I assume you still have the 2TB disk you were replacing? (If not, there may still be a way, but for now I'll assume you do.) If so, the data is still on that drive, so that's good.

Disk 3 should also still have its data on it, so you haven't lost anything there (unless it was lost beforehand).

 

So here's what I would try.

Note the disks and their locations (take a screenshot of how they are assigned now).

Now go to Tools > New Config.

 

In the drop-down menu you will see 'Preserve current assignments'.

Choose to preserve the parity and cache slots, then create the new config.

 

Now add back the data drives as in your screenshot. (Be careful not to add any drives that were not part of the array, if you were using other unassigned drives on the server.)

 

But don't add the 2TB drive to the array. (You want that to remain an unassigned drive so you can copy the data from it to the array later.)

So the array should now contain all of your old data disks (except the 2TB), including the 6TB that you started to rebuild onto.

 

Now start the array. Unraid will want to do a parity sync, as this is a new config. Cancel that; you will do it later.

It will also say that the 6TB drive has no file system (or it should, as the rebuild failed); this is fine. Allow Unraid to format the 6TB drive.

 

The array should now be started. Parity will not be valid, but don't do a parity sync yet; you need to copy the data off the 2TB drive first.

Do this using the Unassigned Devices plugin and the Krusader docker container, and manually transfer the data from the 2TB drive to the 6TB drive.

 

After this has finished (with the data now transferred to the 6TB), you can do a parity sync and your array should be fine. As long as disk 3 is okay and doesn't keep giving read errors, the parity sync should complete OK.

 

I hope that this makes sense 

 

Posted (edited)

Hi SpaceInvaderOne,

 

You are an absolute lifesaver! I've followed the process you described, and I can see all my data! Thank you so much!

 

The only thing is, it did not flag the 6TB drive as having no file system. It's quite odd: it mounted the drive and I can see files there, but I don't have permission to access any of them. Even an ls fails to return the permission details, although the files are listed. I'm assuming these are just pointers from the file allocation table, or something similar, to files that don't actually exist on the disk; maybe the allocation table was partially rebuilt during the data rebuild that ended up failing? I'm still using ReiserFS, for what it's worth.

 

I'm guessing I should somehow instruct Unraid to reformat this disk, and then once that's done, copy the files back from the old 2tb disk?

 

I've installed Unassigned Devices and Krusader, and I've mounted the 2TB drive over USB (I've maxed out my SATA ports) using Unassigned Devices.

 

 

Cheers,

 

Nick

Edited by Nick_J
Updated now that I've got the old 2TB drive mounted over USB


Hey Nick. Glad that you haven't lost your data :) Yeah, you're right: the rebuild had started to reconstruct the data onto the 6TB, so it built the indexes to the files but nothing else.

So yes, just reformat the drive and then copy the data across.

If you haven't figured out how to reformat the 6TB, use the Unassigned Devices plugin to delete the partition:

 

1. Go to the settings of the Unassigned Devices plugin and enable destructive mode.

2. Stop the array.

3. Set the slot with the 6TB drive to unassigned.

4. The 6TB drive will show up as an unassigned device.

5. Delete the partition.

6. Now you can either:

     (i) format the drive using Unassigned Devices to whatever filesystem you want (I would use XFS, as that is now the Unraid default for array drives), or

     (ii) only delete the partition; then, when the drive is back in the array, Unraid will format it with the current default set in Settings > Disk Settings.

7. Go back to the array slot you set to unassigned and add the 6TB disk back to that slot.

8. Go to Tools > New Config, and this time set 'Preserve current assignments' to 'All' and apply.

9. Start the array. Again, cancel the parity sync until the data is safely copied across from the 2TB, then do a parity sync.

 

 


Mate, you are an absolute legend!

 

I've formatted (as XFS - getting that is a bit of a bonus here!) and have started the file copy from the old 2TB drive. It's going to take quite a while by the look of things, over a day. In hindsight, it might have been faster to disable parity altogether, as it looks like Unraid is calculating parity as it writes even though parity is marked as invalid? But all good, I'm patient.

 

Once all the data is there, I'll kick off the parity sync. Will report back here as it goes!

 

Thanks again SpaceInvaderOne!

1 hour ago, Nick_J said:

have started the file copy from the old 2TB drive. It's going to take quite a while by the look of things, over a day.

Shouldn't take that long for only 2TB unless something is wrong or there's a serious bottleneck with the controller. If the speed doesn't improve, post diagnostics.

4 hours ago, trurl said:

Shouldn't take that long for only 2TB unless something is wrong or there's a serious bottleneck with the controller. If the speed doesn't improve, post diagnostics.

Yeah, that's definitely a long time. I'm guessing this may be due to the OP having the drive connected by USB. If the hard drive's USB connection is only USB 2.0 and not 3.0, and USB 2.0 transfers data at about 30MB/s, then 2TB would take about 18-19 hours.
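For reference, that estimate is just quick arithmetic; the ~30MB/s figure is an assumed typical real-world USB 2.0 throughput, not a measurement from this system:

```shell
# Rough transfer-time estimate for a full 2TB drive over USB 2.0.
bytes_total=2000000000000      # 2TB (decimal units, as drives are marketed)
usb2_speed=30000000            # ~30MB/s: assumed real-world USB 2.0 throughput
hours=$(awk -v b="$bytes_total" -v s="$usb2_speed" 'BEGIN { printf "%.1f", b / s / 3600 }')
echo "~${hours} hours"         # before any filesystem/small-file overhead
```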

19 hours ago, trurl said:

Shouldn't take that long for only 2TB unless something is wrong or there's a serious bottleneck with the controller. If the speed doesn't improve, post diagnostics.

 

So the copy completed in less than a day in the end. The average speed was around 25-30MB/sec. The 2TB HDD had 1.8TB of data on it, and it was connected over USB 2.0.

 

However, in saying this: my array has always seemed to max out at about 25MB/sec sustained write speed. I get the first one or two GB in at about 80-90MB/sec, which I assume is Linux writing into the memory cache before it fills and has to flush to disk; after that it drops down to 25MB/sec or so, where it remains indefinitely. I always assumed this was normal without a cache disk. Should I start a different thread about that issue, with the diagnostic details, to see if we can bump up the performance a bit?
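As an aside, the sustained 25MB/sec lines up with a common rule of thumb for Unraid's default read/modify/write parity mode. This is only a rough model, and the 100MB/s platter speed is an assumed typical figure, not measured from this system:

```shell
# Each block written in read/modify/write mode costs four disk operations:
# read old data, read old parity, write new data, write new parity.
# The read and write hit the same sectors, so each spindle waits roughly a
# full rotation between them, capping throughput near a quarter of raw speed.
platter_speed=100                        # MB/s, assumed typical sequential speed
rmw_estimate=$((platter_speed / 4))
echo "read/modify/write estimate: ~${rmw_estimate} MB/s"
# Turbo write avoids the reads on the target disks by reading all the *other*
# data disks instead, so write speed approaches the slowest spindle.
```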

 

Regarding the issue at hand, I've kicked off the parity sync and it's proceeding nicely! It's way past the point where the UDMA CRC errors were hit last time (almost immediately, around the 700MB mark).

11 minutes ago, Nick_J said:

However, in saying this: my array has always seemed to max out at about 25MB/sec sustained write speed. I get the first one or two GB in at about 80-90MB/sec, which I assume is Linux writing into the memory cache before it fills and has to flush to disk; after that it drops down to 25MB/sec or so, where it remains indefinitely. I always assumed this was normal without a cache disk. Should I start a different thread about that issue, with the diagnostic details, to see if we can bump up the performance a bit?

This is slightly on the slow side, but not excessively so.

 

If you do not mind all your disks spinning, you might want to look at what performance you get using Turbo Write mode. There is an associated plugin to help automate when Turbo Write mode should be used, if you do not want to control it manually.

 

On 8/20/2019 at 6:59 PM, itimpi said:

This is slightly on the slow side, but not excessively so.

 

If you do not mind all your disks spinning, you might want to look at what performance you get using Turbo Write mode. There is an associated plugin to help automate when Turbo Write mode should be used, if you do not want to control it manually.

 

Very interesting indeed! That reads quite well. I'm going to give it a try after the parity rebuild completes (later tonight) and then the parity check completes, probably tomorrow night/next morning.


The parity check is still running, all looking good though!

 

I just wanted to say: this is exactly why I chose Unraid many years ago. If what happened to me here had happened on a normal RAID-5 array, chances are I would have lost everything. Even the 2TB drive that had been removed would not have contained any useful data, because the data would have been striped across the drives.

 

So kudos for a great product!


Parity check completed with no errors found, so back in business!

 

I also enabled turbo write mode and I'm seeing some much more respectable speeds. For the first gig or two I'm getting 106MB/sec (i.e. pretty much maxing out the 1Gbps network!), then about 40MB/sec sustained thereafter.

 

Thanks again for your help @SpaceInvaderOne and @itimpi!!!


Hi @Nick_J, glad that it's working.

One more quick thing you can try: check whether write cache is enabled on the drives.

In the web UI you will see that each disk has an ID, i.e. sdb, sde, sdf, etc.

For each disk, use this command (the example here is for my disk sdb):

hdparm -W /dev/sdb

 

[Screenshot: hdparm output showing 'write-caching = 1 (on)']

 

If it doesn't report write-caching as on for the drive, as above,

you can enable it with this command:

hdparm -W 1 /dev/sdb
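To check all the drives in one pass rather than disk by disk, a small loop works too. This is just a sketch of my own: the /dev/sd? glob and the assumption that every matched device is one of your array drives are mine, so adjust for your system (USB bridges in particular may reject hdparm).

```shell
# Check write caching on each /dev/sd? device; keep a simple count.
checked=0
for dev in /dev/sd?; do
    [ -b "$dev" ] || continue        # skip if the glob matched nothing
    echo "== $dev =="
    hdparm -W "$dev" || true         # some USB bridges reject hdparm; ignore
    # hdparm -W 1 "$dev"             # uncomment to enable write caching
    checked=$((checked + 1))
done
echo "checked $checked drive(s)"
```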

 

7 minutes ago, SpaceInvaderOne said:

Hi @Nick_J, glad that it's working.

One more quick thing you can try: check whether write cache is enabled on the drives.

In the web UI you will see that each disk has an ID, i.e. sdb, sde, sdf, etc.

For each disk, use this command (the example here is for my disk sdb):


hdparm -W /dev/sdb

 

[Screenshot: hdparm output showing 'write-caching = 1 (on)']

 

If it doesn't report write-caching as on for the drive, as above,

you can enable it with this command:


hdparm -W 1 /dev/sdb

 

Yep - I've just checked and write caching is enabled for each disk :).

