Migrating 64 TB of material from Windows NTFS to Unraid


RAP2


Been looking at some UNRAID videos - and it seems that you cannot just ingest a drive into UNRAID... you have to copy files from a source NTFS drive, to a destination UNRAID drive that is already formatted and part of UNRAID.  

 

Is that accurate as a basic overview?

 

Does anyone have a video on how best to set up an UNRAID system for migration of a large amount of material?

 

One factor is that I don't have two servers and I don't have double the drive capacity. So I guess it's: set up a parity drive and a new data drive, and then, one drive at a time, copy files over... then take the latest ingested drive, add it to UNRAID, reformat for XFS and repeat... for probably a week?

 

(I am not specifying specific operational requirements in UNRAID - just trying to assess the overall approach.)

 

Rob

Link to comment
Quote

Been looking at some UNRAID videos - and it seems that you cannot just ingest a drive into UNRAID... you have to copy files from a source NTFS drive, to a destination UNRAID drive that is already formatted and part of UNRAID.  

 

Is that accurate as a basic overview?

Yes

 

Easily move, copy and sync files to unRAID, within unRAID and from unRAID using Krusader

 

You need two empty HDDs. Install them as array disks (formatting, etc.).

Leave out parity and cache until all data is on the array.

Procedure:

1. Attach the first data HDD to the server as an unassigned device (SAS / SATA).

Copy everything to the array.

Then add that data disk to the array (it will be formatted!).

You now have 3 HDDs in the array.

2. Attach the second data HDD to the server as an unassigned device (SAS / SATA).

Copy everything to the array.

Then add that data disk to the array (it will be formatted!).

Repeat this until all of your data is on the array.

When all HDDs are in the array at the end, install the parity disk and add it to the array as parity (this should then be your last former data HDD and the largest of your HDDs).

Then install the cache.


Advantage: it is faster than over the network and you can gradually add your old disks to the array.

Disadvantage: At the point where you have copied the data from the respective old disk to the array and then install (and format) this HDD, the data is only on the array - if something goes wrong, it is gone ...
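To check on paper whether the disk-by-disk procedure above fits into the available space, here is a minimal Python sketch. It is only a bookkeeping model, not anything Unraid provides: the `SourceDisk` class, the disk labels and the sizes are hypothetical, and it ignores filesystem overhead and the fact that a single file has to fit on one array disk.

```python
from dataclasses import dataclass


@dataclass
class SourceDisk:
    name: str           # hypothetical label, e.g. "ntfs1"
    capacity_tb: float  # raw size of the old NTFS disk
    used_tb: float      # data currently stored on it


def plan_rolling_migration(initial_free_tb, sources):
    """Walk through copy -> format -> add for each old disk.

    Returns (steps, free_tb_at_end). Raises ValueError if a disk's data
    would not fit into the free array space at that point.
    """
    free = float(initial_free_tb)
    steps = []
    for disk in sources:
        if disk.used_tb > free:
            raise ValueError(
                f"{disk.name}: {disk.used_tb} TB does not fit in {free} TB free"
            )
        free -= disk.used_tb
        steps.append(f"copy {disk.name} to the array ({disk.used_tb} TB)")
        # Only after the copy is done is the old disk wiped and reused.
        free += disk.capacity_tb
        steps.append(f"format {disk.name} and add it to the array")
    return steps, free


if __name__ == "__main__":
    # Assumed example: eight 8 TB disks with 7 TB used each, one empty 8 TB disk.
    old_disks = [SourceDisk(f"ntfs{i}", 8.0, 7.0) for i in range(1, 9)]
    steps, free = plan_rolling_migration(8.0, old_disks)
    for number, step in enumerate(steps, start=1):
        print(number, step)
    print("free at end (TB):", free)
```

If the last old disk is to become the parity disk instead of a data disk, as suggested above, its capacity should not be counted as free space in the final step.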

Link to comment

Unraid, or any RAID for that matter, is not a backup, it's high availability that allows the reconstruction of failed drives up to the limit of redundancy. It cannot protect against data deletion or corruption, user error, malicious acts, etc.

 

You must have backup of any data you don't want to lose.

 

Moving data from drive to drive, and especially the kind of mass migration you are talking about, is especially risky because of the number of things that potentially can go wrong.

 

I highly recommend using this as an opportunity to set up backups, so you can copy data and set up applications at your leisure instead of worrying about accidentally formatting the wrong drive or corrupting data with a bum cable that you don't find until after the source files are deleted.
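One way to reduce the "bum cable" risk is to compare the copy against the source before the source disk is formatted. The following is a minimal sketch, assuming Python 3 is available on the machine doing the check (it is not assumed to be part of a stock Unraid install). The function names are made up for illustration, and the paths in the usage note are assumptions about where the disks are mounted.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large media files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root, dest_root):
    """Return (relative path, problem) for every source file that is
    missing at the destination or whose content differs."""
    source_root = Path(source_root)
    dest_root = Path(dest_root)
    problems = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_root)
        dst = dest_root / relative
        if not dst.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif file_digest(src) != file_digest(dst):
            problems.append((relative.as_posix(), "content differs"))
    return problems
```

Usage would be something like `find_mismatches("/mnt/disks/old_ntfs", "/mnt/disk1")`, where both paths are placeholders for the mounted source disk and the array disk. An empty result means every source file has an identical copy; it does not replace a real backup.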

Link to comment

@JonathanM

I absolutely agree.


However, RAP2 is asking for a way to get the data onto the array, because he doesn't want to buy any new HDDs ... (65 TB of space at current prices is not exactly cheap either, at least here in Germany).

A few days ago I was faced with exactly the same situation (with 45 TB of data) and decided to buy 4x 18 TB HDDs. Not because the data is so important or because I need it for business - but I've spent a lot of time over the years to bring it all together.

 

So if RAP2 wants to implement his approach he has to live with the risk.

Link to comment

@ Fuxdom - thanks - that is exactly the case.  I'm not some data-centre that has budgets for this kind of thing.  Not here in Canada! 🙂

 

Please review, if this is possible: 

  1. Start with a new parity and one new data drive.  Turn parity off. 
  2. Copy all files from my first NTFS drive to the first XFS data drive.  Turn on parity.  Ensure that I have a secure system. 
  3. Turn parity off.   Take the first (copied) NTFS drive and format to XFS and add to UNRAID. 
  4. Repeat steps 2 and 3 with each drive until complete.  (8 x 8TB drives)

This is the only way I can think of to ensure - before formatting my data drives in UNRAID - that I have some amount of failure protection.

 

In step 3 - I suppose I can leave parity on and just extend the entire copy process; but what I do not understand is why I would leave parity off for the entire copy and simply "cross my fingers" - and hope?

 

If turning parity on and off between each drive is a non-starter, then I would probably just turn parity on and leave it on during the entire migration; that's the only way I have some redundancy in case of a drive failure.  There is a high probability that I will NOT have a drive failure during the copy:  I use Ironwolf NAS drives and Hard Disk Sentinel to manage health status and routinely do random surface tests.  If I lose the odd file due to some more recent drive issues -  well - not much I can do about that ...and frankly that's why I'm here. 

 

This is also happening on a brand new server build - not some old thing that has a greater chance of failing.  I will also not start migration for a week or so, to make sure the new build is stable.  The server will also be on a relatively new UPS, since my last one failed a short time ago.

 

Love to hear more comments about the migration process - as I really am a newbie on UNRAID - AND - I am still struggling to actually get an UNRAID USB key to work on my system.  (A separate post under pre-sales support.)

 

Cheers!

 

Link to comment

The only reason to turn parity off would be to speed up the copy process, as writing to the array is slower with parity enabled (parity is real-time).  Every time you turn it on there would then be a lengthy parity build process.

 

You either want it on for the whole process so data is protected, or off so copying is faster but data is not protected until you build parity at the end of the copying.

Link to comment

Yea... that was the logical implication.  I will turn it on and have some net rather than no net. (safety not network)  🙂

 

Thanks to all who participated in my education here.  🙂 

Link to comment
Quote

....turn parity on and leave it on during the entire migration;

 

That's exactly how I would do it in your case.

In my case, the only reason I left out parity was because everything was backed up. The reason for doing it without parity was simply the speed. I read somewhere that someone with parity enabled copied about 10 TB onto the array and it took about 17 hours. (I don't know if it's true - I didn't feel like trying it out)
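For reference, the second-hand figure above (10 TB in about 17 hours) can be turned into an average throughput with a few lines of Python. This is plain arithmetic on the quoted numbers, using decimal terabytes; it says nothing about what a particular array will actually achieve.

```python
TB = 10**12  # decimal terabyte, as drive vendors count it


def average_rate_mb_s(total_bytes, hours):
    """Average throughput in MB/s for a transfer of total_bytes over hours."""
    return total_bytes / (hours * 3600) / 1e6


if __name__ == "__main__":
    rate = average_rate_mb_s(10 * TB, 17)
    print(f"10 TB in 17 h is about {rate:.0f} MB/s on average")
```

That works out to roughly 163 MB/s averaged over the whole copy.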

A few more thoughts about the USB stick


Since the USB stick acts as a drive for the operating system, you should use a decent, high-quality stick here.


We have had very good experiences with the Transcend JetFlash 780 - 32 GB (In Germany 21 Euro; Amazon) over the past few years. The smaller version with 16 GB capacity is actually sufficient, but the price difference is marginal and sticks with a higher capacity are designed for larger amounts of read and write cycles. Hence the recommendation to use the 32 GB variant. In addition, Unraid requires a USB stick with a GUID (i.e. a unique serial number) for licensing, which many cheap sticks do not have.

 

 

Link to comment

Thanks Fuxdom...

 

I think I got the UNRAID boot USB sorted... The BIOS setting was set to Windows UEFI (under Secure Boot). As soon as I set it to Other OS, all my ports worked and I tested a few sticks - they all worked too!

 

On boot, the GUI was not working (black screen) - but my guess is that it is just not configured yet.  Over the weekend I will access the web interface and see.  I'm still waiting for some hardware - an LSI SAS 9207-8i.  I've decided to get a 12TB Ironwolf NAS Pro for my parity - and the 8TB version I had already purchased will instead be used for my first data drive to start the migration.  I still have a ton of questions - so I need to pose them over the next 10 days or so, because as soon as the hardware arrives I need to start the process.

 

Thanks again and have a great weekend!

Link to comment

 

I don't know what it's like in Canada - I get the Seagate 18TB for 355.90 € - on the other hand, the 12TB Ironwolf NAS Pro costs 369.90 € ......

I think the Exos Enterprise HDDs are great; compared with the WD Pro they have a lower failure probability and a better reputation, and they are even cheaper here in Germany.

(I've been using several 8 TB HDDs for a few years and have never had any problems)

I do not understand why the forum so often steers towards the WD NAS HDDs and you hardly read anything about the Exos.

Compare the datasheets...

 

I wish you every success with the LSI SAS 9207-8i. Last year I bought a new SAS / SATA RAID controller PCI Express host bus adapter LSI 9211-8I, LSI SAS2008 chip, 8-port 6Gb / s (was not intended for an Unraid NAS).

It came with ancient IR firmware / BIOS - I am currently trying to flash it to IT [x] mode with the new BIOS / firmware - mainly because it takes 3-4 minutes to boot.

So far it has not worked ...

Do you have SAS HDDs? And why a RAID controller card? It is completely unnecessary in Unraid.

Something like that would make more sense (and at least now also cheaper):

https://www.amazon.de/gp/product/B08F56WKW7/ref=ox_sc_saved_title_7?smid=A21CTMDLFXECIV&th=1

 

 

[Seagate Exos X18 Enterprise, 18 TB HDD, CMR 3.5 inch, Hyperscale SATA 6 Gb/s, 7,200 rpm, 512e, 4Kn FastFormat, low latency with improved caching, model no.: ST18000NM000J]

 

 

[x]

IR firmware versus IT firmware

When the HBA is delivered, the IR firmware or an integrated RAID firmware is usually installed. This allows the use of the raid controller with its modes such as Raid0, Raid1, Raid5, Raid6, Raid10 or e.g. Raid50.

The IT firmware (initiator target) works differently. It acts as a pure controller in passthrough mode. There is no extra raid layer. LSI refers to this firmware as follows: “IT firmware maximizes the connectivity and performance aspects of an HBA”. And that is exactly the reason why these cards are flashed in IT mode. MDADM, ZFS, or other types of software raid are intended to be used. The raid is controlled by the software of the operating system and not that of the card.

 

Link to comment
  • 2 weeks later...

Hmmm... the  LSI SAS 9207-8i was defined as compatible OOTB - and it is not RAID - it shows as HBA controller and by default in IT mode.  In fact it was advertised as such.  See here:

Hardware Compatibility - Unraid | Docs

 

Anyway, the LSI 8-port controller - all used BTW - was about the same price as the MZHOU 6-port controller from your Amazon link.  The board came with 2x 4-port SAS to SATA cables for around $100 CAD.

 

I also was waiting for an Intel dual-port server NIC.  Finally, I ended up getting a WD Gold 10TB drive that was on sale.  I will use this for my parity drive.

 

BIOS/BOOT:
I had to fuss with the BIOS again to get UNRAID to boot.  (For some reason it did not keep the setting for the USB flash drive as the primary boot device.)  It seems my BIOS does not store the USB device as the primary.  I want it saved this way so that if I yank the UNRAID USB device, it will boot from a Windows 10 drive automatically - but if the UNRAID USB is inserted, it boots from that since it is primary.  Is this not possible without going into the BIOS each time?

 

PARITY:

Easy-peasy to identify and select my new WD Gold 10TB drive - it's the only one I have.

 

CACHE:
In the UNRAID MAIN page I do not see a place to set the CACHE drives - that are in various videos and help pages.  Does this need to be activated somewhere first? 

 

DATA DRIVES - IDENTIFYING WINDOWS DRIVES:

I have no way of identifying the drives with what UNRAID tells me.  I need to sort out how to do this so I can properly select the correct (BLANK) data drive to add to the array - and then copy files from my original Windows NTFS drive to the first data drive.  (Then turn Parity on to protect that drive.)  I took snapshots of my Windows Disk Management and Hard Disk Sentinel reports - but neither shows a unique drive identifier - and of course I cannot see Windows names.  I'm really wondering if UNRAID is for Linux guys - and I am in a Windows-centric network - or - it's really set up to start from scratch and "who cares" what drives one selects.

 

All the on-line examples seem to be based on starting an UNRAID system from scratch.  I'm sure this happens, but it seems to me that most folks are NOT doing that.  (Certainly most who come from existing Windows systems.)

 

Ideas on how to deal with this last item would allow me to actually start building my array - then create shares to test on my Windows network.  (After reading all about the SMB issues, I have to admit, I'm getting worried.)  How do I do this?

 

 

My evaluation is based on setting up one parity drive, one data drive and a cache - then testing shares on Windows - before I fully commit to UNRAID.  Certainly before I transfer the other 56 TB of media, one drive at a time.

 

Thanks again all...

 

Link to comment
17 minutes ago, RAP2 said:

I have no way of identifying the drives with what UNRAID tells me.  I need to sort out how to do this so I can properly select the correct (BLANK) data drive to add to the array - and then copy files from my original Windows NTFS drive to the first data drive.  (Then turn Parity on to protect that drive.)  I took snapshots of my Windows Disk Management and Hard Disk Sentinel reports - but neither shows a unique drive identifier - and of course I cannot see Windows names.  I'm really wondering if UNRAID is for Linux guys - and I am in a Windows-centric network - or - it's really set up to start from scratch and "who cares" what drives one selects.

Unraid identifies drives by their serial numbers.  I just write down the last four digits of each drive's serial number.  That way, I know exactly what drive I am assigning.

Link to comment
58 minutes ago, RAP2 said:

CACHE:
In the UNRAID MAIN page I do not see a place to set the CACHE drives - that are in various videos and help pages.  Does this need to be activated somewhere first? 

Cache is just a special case of using a pool, so use the option to add a new pool.  You can call it anything you like.  You then configure whether shares should use any particular pool for caching purposes at the individual share level.

 

Some of the confusion may have arisen from the fact that prior to the 6.9.0 releases there was only one pool allowed and its name was always 'cache'.

Link to comment

@ itempi:  That makes sense. Thanks!

 

@ Frank1940:  Yea - except I need to match the drives as set up in Windows Disk Management with how the drives are presented in UNRAID, and that does not seem straightforward.  I see no way of doing this from Disk Management or indeed Device Properties.  SNs are not available there.  If I run the CMD:  "wmic diskdrive get model,name,serialnumber" - the serial numbers are there - but the physical drive numbers... are these the "Disk" numbers shown in Disk Management?  They do not seem to match the disk #'s in Disk Management. 
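As a rough aid for this matching problem, the text output of `wmic diskdrive get index,model,serialnumber` can be parsed into a small lookup table. The sketch below assumes the usual fixed-width layout (a header row, then one row per disk) and uses made-up sample data. The `Index` property is documented as the physical drive number, which normally corresponds to "Disk N" in Disk Management, but that is worth confirming against a disk of known size before trusting it.

```python
import re


def parse_wmic_table(text):
    """Parse fixed-width wmic output into a list of {column: value} dicts."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0]
    # Column boundaries are taken from where each header word starts.
    columns = [(m.group(), m.start()) for m in re.finditer(r"\S+", header)]
    rows = []
    for line in lines[1:]:
        row = {}
        for i, (name, start) in enumerate(columns):
            end = columns[i + 1][1] if i + 1 < len(columns) else None
            row[name] = line[start:end].strip()
        rows.append(row)
    return rows


def by_last_four(rows):
    """Map the last four characters of each serial number to its disk index."""
    return {row["SerialNumber"][-4:]: int(row["Index"]) for row in rows}


if __name__ == "__main__":
    # Hypothetical sample rows; real output comes from the wmic command.
    sample_rows = [
        ("Index", "Model", "SerialNumber"),
        ("0", "ST8000VN004-2M2101", "WKD0AAAA"),
        ("1", "WDC WD102KRYZ", "VCG1BBBB"),
    ]
    sample = "\n".join(f"{a:<7}{b:<21}{c}" for a, b, c in sample_rows)
    print(by_last_four(parse_wmic_table(sample)))
```

The last-four-characters key mirrors the labelling habit described earlier in the thread, so the same short code can be checked against what Unraid shows on its Main tab.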

 

I'm hoping I don't need to disconnect all drives and attach them one at a time, reboot and drill down.... 

Link to comment
1 hour ago, RAP2 said:

Except, I need to identify the drives set up in Windows Disk Management with how the drives are presented in UNRAID; and that does not seem straight forward. 

 

This sentence is confusing to me.  Where are the drives physically at?  In your Windows computer or on the Linux server?

 

In Unraid, you will find this displayed on the Main tab:

[Image: screenshot of the array devices on the Unraid Main tab]

 

The circled red portion is the logical designation that Unraid uses to identify the position of a disk in the array.  The green circle is the serial number of the actual physical drive assigned.  The black circled entry is assigned by the Linux OS as it boots up.  (I believe it is done in order of detection...)  (This black circled designation would be used if you want access to the disk from the Linux OS command line.)
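The link between the serial number and the Linux device name can also be listed from the command line. On Linux systems that use udev, `/dev/disk/by-id` contains symlinks whose names include model and serial number and which point at the kernel device (sdb, sdc, ...). The sketch below just reads those links; the function name is made up, and it assumes Python 3 is available on the server.

```python
import os


def map_disk_ids(by_id_dir="/dev/disk/by-id"):
    """Return {by-id name: kernel device name}, skipping partition entries."""
    mapping = {}
    for entry in sorted(os.listdir(by_id_dir)):
        if "-part" in entry:
            continue  # partition links such as ...-part1
        target = os.path.realpath(os.path.join(by_id_dir, entry))
        mapping[entry] = os.path.basename(target)
    return mapping


if __name__ == "__main__":
    if os.path.isdir("/dev/disk/by-id"):
        for name, device in map_disk_ids().items():
            print(f"{device:<8} {name}")
```

The same disk can appear under more than one name in that directory, so several entries may map to the same device.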

 

One more thing.  You cannot simply assign a disk from a Windows computer to the array and expect its data to be readable.  Unraid needs to first format that disk (thus erasing all of the data) using one of these three file systems-- XFS, BTRFS or ReiserFS-- before it is added to the array.  You must copy the data from the Windows disk (usually formatted with NTFS) to a drive previously formatted in one of these three formats. 

 

I believe that the Unassigned Devices plugin will read data from NTFS disks. 

 

It is also possible to mount SMB shares with Unassigned Devices and copy them across the network.  

Link to comment

I disconnected all but my new parity drive and a new 8TB data drive.  I added the two drives and started the array.  The Parity drive is building.  The data drive - brand new but originally formatted for NTFS - says "unmountable" - so I formatted it; now it is XFS.  Same deal with the Cache NVMe, except it is saying the FS is btrfs?

 

So now I need to copy my first data drive - which is an NTFS volume inside the same machine.  I installed "Unassigned Devices" - so I can mount and view the NTFS drive contents...  but there is no obvious way to copy files.  Do I have to create shares first?

Link to comment

@ Frank1940:  My drives are all in what is now a Windows 10 machine.  It used to be a Windows 2008 Server.  

 

I am currently running UNRAID on this machine.  At the moment, I have a new 10TB parity drive and a new 8TB data drive (as Disk 1) in my (started) array.  Parity is being built - the status bar is stating:  Parity-Sync / Data-Rebuild 31.0%.  So it's probably about 6 hours away.

 

Yes - it's why I installed Unassigned Devices; in hopes of facilitating copying files from NTFS volumes.  When I mount the drive, I can see the partitions and when I click on them, I see the directories and the files; but no indication of how to copy files.

 

 

Link to comment

@ Frank1940: Just circling back to your earlier comments.  I did not have a problem seeing the drive model and S/N in UNRAID.  The problem I had was in Windows - so that I could make sure I was adding the correct drive to the array.  In the end - and just to get started - I unplugged all my drives, except the 2 new ones (Parity and Data).  That way I could not accidentally add the wrong drives.  Once I did that, I added my first NTFS data drive that needs to be copied to the new XFS data drive.  That's where I'm stuck.  I can mount and view the drive but how do I copy it?  

 

Seems to me it should load some kind of explorer window that lets you see the mounted NTFS drive - plus any drive available in the array - like those old school explorer windows.  File copying - IMHO - should be easy and intuitive.  It's 2021 after all; I can do this on my phone.

 

More:

- Been doing some reading and now I'm concerned about not being able to keep the file structures created on my Windows machines.

- I've also noticed that one of my Windows partitions had an exclamation mark in its name.  It's been replaced by an underscore: "_"  Is this a Linux thing?
- The Parity drive has an orange exclamation mark saying "Parity is invalid".  I'm assuming this goes away once the build is complete?

Link to comment

I've been re-reviewing the Spaceinvader One Krusader video - to "easily" copy files... hmmmm.  The video tutorial does not represent what I see on my system.  Example - I have no "Unassigned" and "UNRAID" folders... nor can I navigate to a mounted NTFS drive.  It also seems like you need to create shares before you copy files or even have a directory.  Is this an old plug-in?

 

Why does it feel like I'm going backwards to pre-Windows 1.0 DOS explorers?

Link to comment

There is also the    mc    File manager that can be run using the built in Terminal program.    If you need assistance in using it, google   linux mc tutorial  

 

I, personally, have never had occasion to use the Unassigned Devices plugin but there is a support thread for it.   

 

       https://forums.unraid.net/topic/92462-unassigned-devices-managing-disk-drives-and-remote-shares-outside-of-the-unraid-array/

 

Remember it is just a tool for mounting devices so that you can access them from Unraid.    What I don't know is where it mounts the devices on the Linux file system.  I will ping @dlandon who is the developer of the plugin for you. 

 

What you could do is set up Unraid share(s) for the various share(s) on your Win10 server.  Then use Windows Explorer to copy the files across to the Unraid server.   This would be as fast as doing the copy via an SMB share mounted using Unassigned Devices.   (Unassigned Devices can also be used to attach actual physical drives on the Unraid server for whatever reason the user may want to use them for.  That connection could be either SATA or USB.  Of course, this would mean that you would have to remove the drives from the Windows server...)

Link to comment

Thanks for the UD forum link!  That was perfect.

 

I've managed to start some copy processes with Krusader:  3 top-level folders from the mounted NTFS data drive to my one XFS data drive.  OMG - slow as molasses; after 2 hours, I have 17 GB of 7 TB copied. It says it is going to take 20 days to do that!

 

I wonder if I should pause the Parity build and restart it after the copy?

 

Link to comment

Regarding copying files from Explorer on the Windows 10 machine to UNRAID:  the UNRAID machine is the Windows 10 machine.  When I rebuilt the 2008 Server machine I switched to Windows 10 and got all my shares working on my home network.  Now I am trying to migrate everything to UNRAID - but it's the same physical machine. 

 

I plan to create a Windows 10 VM once I test the first data drive.  I want to make sure that the 3 shares this data drive provides are working on all the Windows and Android network devices first.  Hopefully, I can then run Windows 10 in a VM while UNRAID is operating.  But for now, obviously, I do not have access to this Windows 10 installation, nor any of my data.

 

I have a back-up of this one data drive - and I'm just doing a copy in any case; so should I pause the parity, hope that the copy happens faster and then start it again after it's done?

Link to comment
8 hours ago, RAP2 said:

 

I wonder if I should pause the Parity build and restart it after the copy?

You certainly do not want a parity build running concurrently with a large copy, as that will cause drive contention and is likely to take longer than running them one after the other.

 

Many people leave the parity disk unassigned during the initial data load, as the copying is much faster without a parity disk assigned.  However, your array is not protected against a drive failure until parity is built, so it is a risk trade-off that only you can make as to what best suits you. 

 

Link to comment

I pressed "Pause" on the Parity last night.  This morning one of the three copies is finished, one is 2 days away and the other is 5 days away.  This is with one 8 TB drive with about 7 TB total of actual data.   Krusader is telling me I am getting <6MiB/s.  That seems ridiculously slow.

 

Once the copy is finished I will turn on the parity and let it complete.  Then I will install Windows 10 in a VM - to replace the one that is on my standard boot drive for that machine.   Then all my Windows shares for my network...

 

Once I test everything, I still have another seven 8 TB drives (56 TB) to copy.  Based on the first drive - and if the Krusader ETAs are accurate - it will take 35 days just for the data copy (5 days x 7 drives).  This does not include the parity process that I will turn back on after each drive so that everything is protected.

 

However, at least everything is protected and I will have access to the data that has been copied.  
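To put the numbers quoted above into perspective, here is a small Python sketch that converts between data size, transfer rate and duration. It is arithmetic only: decimal terabytes, binary MiB as Krusader reports them, and a constant rate are all simplifying assumptions.

```python
MIB = 1024**2   # bytes in one MiB
TB = 10**12     # bytes in one decimal terabyte
SECONDS_PER_DAY = 86400


def eta_days(total_bytes, rate_bytes_per_s):
    """Days needed to move total_bytes at a constant rate."""
    return total_bytes / rate_bytes_per_s / SECONDS_PER_DAY


def required_rate_mb_s(total_bytes, days):
    """Constant rate in MB/s needed to move total_bytes within days."""
    return total_bytes / (days * SECONDS_PER_DAY) / 1e6


if __name__ == "__main__":
    print(f"7 TB at 6 MiB/s: {eta_days(7 * TB, 6 * MIB):.1f} days")
    print(f"7 TB in one day needs {required_rate_mb_s(7 * TB, 1):.0f} MB/s")
```

At a steady 6 MiB/s, a single 7 TB drive would need close to 13 days, while moving it in one day would need about 81 MB/s sustained.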

Link to comment

It is difficult to tell exactly how you are setting up your copies. 

 

However, one thing to avoid is setting up two separate copy operations that use the same hard disk on either end of the operation.  If you do, it will introduce delays caused by mechanical head movement and disk rotational latency.  Another source of delay is transfers of small files.  Large numbers of small files require a lot of file overhead operations that materially affect transfer speeds once the onboard RAM cache is filled.  Remember, too, that when you copy over the network, the network is a prime bottleneck because of its speed.  (I have observed this effect myself when I do the monthly data backup of the computers in my home.  I generally start all of the computers doing their backups at the same time and walk away and get breakfast.  I do it this way because I don't want to sit waiting for one to finish before starting the next one-- only talking a few GB of data here on each computer.)

 

Another possible source of slow transfers can be the use of SMR disks rather than CMR disks.  (I know this is a source of debate but theoretically SMR write speeds are slower than CMR.) 
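The advice about not running several copies against the same disk can be expressed as a simple queue that starts the next folder only when the previous one has finished. This is a minimal sketch using Python's standard library, not what Krusader does internally; the function name and the example paths are assumptions.

```python
import shutil
from pathlib import Path


def copy_one_after_another(jobs):
    """Copy each (source_dir, dest_dir) pair in turn, never in parallel.

    Running the jobs back to back keeps a single sequential read stream on
    the source disk and a single write stream on the destination disk.
    """
    finished = []
    for source_dir, dest_dir in jobs:
        # dirs_exist_ok (Python 3.8+) allows re-running after an interruption.
        shutil.copytree(source_dir, dest_dir, dirs_exist_ok=True)
        finished.append(Path(dest_dir))
    return finished
```

A call might look like `copy_one_after_another([("/mnt/disks/old_ntfs/Movies", "/mnt/disk1/Movies"), ("/mnt/disks/old_ntfs/Music", "/mnt/disk1/Music")])`, with the paths adjusted to wherever the source and target are actually mounted.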

Link to comment
