Migrating to unRAID - Questions to Ease my Anxiety


manofoz
Go to solution Solved by manofoz,

Hello!

 

I am building a new server for unRAID with plenty of storage and I will be starting off by using it as a Plex server. I will be using whatever the latest version of unraid is when I purchase it next week. I tried to make this as brief as possible and only ask the questions that are keeping me up at night. I made a longer post on Reddit, going into more detail, and didn't get much of a response:

 

 

Relevant information about the server:

 

Unraid Version: 6.12.4 (unless a new one comes out before next week)

 

Disks:

  • HDD: 4 x 20 TB, 1 x 4 TB, 2 x 6 TB, 1 x 8 TB, 1 x 14 TB + misc others I am purchasing from my friend next week.
  • SSD: 4x 2TB M.2

 

Media:

  • 22 TB of Movies
  • 3 TB of TV Shows
  • 1 TB of Photos / Home Video (Camera Backup)

 

I will be migrating the data by plugging the occupied drives in via USB 3.0 and using the unassigned drives feature to copy files to the array. Once all are copied I will open the case and install the drives and then clean and format them so they can be added into the main array. 

 

1. Should I be using 1 or 2 disks for Parity?

 

My research tells me that you can't tell exactly which disk failed with only one because you won't know if it was in the array or the parity disk itself. I think this is enough motivation to use 2 but that would be 2x20TB disks, lot of storage but no worry about replacing parity as I add/replace disks. 

 

2. Should my Parity drives be plugged directly into my motherboard's SATA ports or is the HBA fine?

 

In my research I see people preferring to plug Parity drives into the mobo and use HBA for the rest. I can go either way.

 

2. Should I pre-organize my data before the transfer?

 

My organizational structure per disk right now, no RAID, starting with Media/<Movies,Photos,TV Shows>/ and the rest is a bit messy. I was thinking for the new server I will organize the media by resolution so Media/<Movies,Photos,TV Shows>/<4kHDR, 4kDV, 1080p, 720p, PooPoo> but maybe that will be simplified to SHARE/Resolution/ in a following question. 

 

3. Should each video file be in its own folder?


I don't do this right now. I see people saying they do this. I don't know why you would do this. If it's a good idea I will need to write a script to copy all the files into folders with the same name which frightens me a bit. My media is not backed up and moving 30TBs of files with a script seems scary. 

 

4. How much do I need to worry about User Shares and Split Level configurations?

 

I read the manual on this, and since I don't yet have the server built I'm lacking some hands-on learning. I understand a User Share as a logical drive which you can configure to map to specific drives and put files on those drives using split levels. People seem to either create one Media folder on the array or have multiple Shares for each Media type. They say they can better configure each share depending on the type of media. I guess this makes sense for permissions and if you didn't want everything on every disk. I feel like for my use case I can just trust unRAID to manage and balance the media across drives, but I don't want to have to move things around down the road if I'm mistaken here.

 

5. Will using Krusader to copy files unbalance the drives? Should I use Midnight Commander instead?

 

I'm familiar with a Linux command prompt from my job, but for a transfer this large I'd rather use something with a UI. However, I have read that Krusader first creates all the folders for everything it plans to copy and only then does the copy, which throws off any balancing or split level settings (such as the high watermark setting). It sounded like you could transfer from Krusader in batches to avoid this, but I'm not sure; it's a confusing problem. I could also transfer over the network, but two of my drives are already external, and using unassigned drives cuts the risk of a computer crashing mid-transfer in half vs. the network (I have a 2.5 Gbps wired network so it would not bottleneck).

 

6. Is having two 2x2TB Raid 1 Cache Pools a bad idea?

 

I want a 2TB (raid 1) cache which allows high speed transfer over the network and then copies to the array at night or when close to full. I also want a 2TB (raid 1) cache that Plex and other containers can store configuration and appsettings on. Also, if I create a VM I'd use this to give it some storage. I am going with Raid 1 just so they aren't single points of failure. My mobo supports 4 M.2 drives and they are cheap right now so I figured I'd get 4 of them.

 

Thank you for the help! Hopefully I can change my name on the forum after making this post... 

Link to comment

Plex documentation does recommend movies in their own folders. That's going to be a bit of a headache. Plus my file paths will get a bit longer but still shouldn't be anything Windows couldn't handle. 

 

Quote

Movies in Their Own Folders

Movie files can be placed into individual folders and this is recommended, as it can (sometimes significantly) increase the speed of scanning in new media. If you have external media for a movie (e.g. custom poster, external subtitle files, etc.) you should usually place the movie in a nested folder along with the custom media files. Name the folder the same as the movie file:

 

Now I'm thinking maybe I should put each version of the movie in the movie folder, so have " - 1080p" and " - 4k" appended to the name of the files after the year. I could further break it up by having a folder per letter like "A", "B", "C" so that the number of subfolders doesn't grow too large. Maybe that isn't necessary.

Edited by manofoz
Link to comment

I'll start with the first question and see if others want to chime in.

1 hour ago, manofoz said:

1. Should I be using 1 or 2 disks for Parity?

 

 

1 hour ago, manofoz said:

HDD: 4 x 20 TB, 1 x 4 TB, 2 x 6 TB, 1 x 8 TB, 1 x 14 TB + misc others I am purchasing from my friend next week.

This is apparently not a complete drive count so not going to be able to give a considered opinion on the first question.

 

But you have the "motivation" for dual parity all wrong. You will know which drive gets disabled, because Unraid will disable it when a write to it fails. That failed write makes it out-of-sync with the array because parity gets updated anyway so the failed write can be recovered. This works the same whether you have 1 or 2 parity.

 

The situation where there might be some uncertainty about which drive is the cause is when you have sync errors when doing a parity check. And dual parity won't clear that situation up either.

 

Dual parity begins to make sense the more data disks you have.

 

Also, since you mention getting some disks that have apparently already had some use, you need to be sure to test each disk before using it in the parity array. Parity by itself can recover nothing. Parity is just an extra bit that allows a missing bit to be calculated from all the other bits (this is how parity works in any system, whether Unraid or something else). All disks must be reliably read to reliably rebuild a failed disk.
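To make the "missing bit calculated from all the other bits" point concrete, here is a minimal sketch that models single parity as a byte-wise XOR across disks. The function names and the tiny two-byte "disks" are made up for illustration; this is not Unraid's actual code, just the arithmetic behind it.

```python
from functools import reduce


def xor_parity(disks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized disk images."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recompute one missing disk from every surviving disk plus parity."""
    return xor_parity(surviving + [parity])


# Three hypothetical data disks, two bytes each.
disks = [b"\x0f\xa0", b"\x33\x01", b"\xc4\x7e"]
parity = xor_parity(disks)

# Pretend disk index 1 failed: it can be recovered only because every
# other disk (and parity) can still be read correctly.
rebuilt = rebuild_missing([disks[0], disks[2]], parity)
assert rebuilt == disks[1]
```

Note that the rebuild reads every surviving disk in full, which is why a flaky used disk puts the rebuild of any other disk at risk.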

 

https://docs.unraid.net/unraid-os/manual/what-is-unraid/#parity-protected-array

 

Link to comment
26 minutes ago, manofoz said:

Plex documentation does recommend

Do you have any experience with Plex? If not, you might try setting it up on Windows or wherever you already have all your media just to get a better idea. The application is basically the same thing whether it is running on Windows or in a docker container on Unraid.

Link to comment
2 hours ago, manofoz said:

6. Is having two 2x2TB Raid 1 Cache Pools a bad idea?

Not a bad idea at all. I have 2 pools, each used differently just in the way you are considering. With 2TB though, it might not be necessary. After the initial data load, you probably won't be filling cache daily.

 

And you should definitely not cache the initial data load. It is impossible to move from cache to array as fast as you can write to cache. If you intend to write more at one time than cache can hold, don't cache.
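A rough back-of-the-envelope sketch of why that is, using made-up numbers: the write and drain rates below are assumptions for illustration only, not measurements from any particular system.

```python
def hours_until_cache_full(cache_tb: float, write_mb_s: float,
                           drain_mb_s: float = 0.0) -> float:
    """Hours until a cache pool fills, given incoming and outgoing rates.

    cache_tb   -- usable pool size in TB (1 TB taken as 1,000,000 MB)
    write_mb_s -- assumed rate at which new data lands on the cache
    drain_mb_s -- assumed rate at which data is moved off to the array
                  (0 if nothing is moving data off during the copy)
    """
    net_rate = write_mb_s - drain_mb_s
    if net_rate <= 0:
        return float("inf")  # the cache never fills
    return cache_tb * 1_000_000 / net_rate / 3600


# A 2 TB pool receiving an assumed 250 MB/s with nothing draining it
# fills in roughly 2.2 hours, long before a 26 TB load is finished.
print(round(hours_until_cache_full(2, 250), 1))
```

Whatever the real rates are, the pool fills as soon as the total written exceeds what it can hold plus what has been moved off in the meantime, so a load many times the size of the pool gains nothing from going through it.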

Link to comment
5 minutes ago, trurl said:

Do you have any experience with Plex? If not, you might try setting it up on Windows or wherever you already have all your media just to get a better idea. The application is basically the same thing whether it is running on Windows or in a docker container on Unraid.

Yes, I have been using Plex for 6 years without a problem. I use it locally and have 7 remote users, close friends and family, who also use the server. The problem lies with how my media is currently organized. I have 5 disks and each has Media/Movies and Media/TV Shows folders, and in those folders is just a dump of media. It works fine, but I will be moving to a RAID array and will now have the problem that 1080p media collides with 4k media etc. TV Shows won't be a problem, I don't have multiple formats there yet, but movies will need some pre-organizing before I migrate them to the array. I was thinking of going with a folder per movie and then "- 1080p", "- 4k" etc. appended before the file extension for different formats. This will be either labor intensive or require writing a script to handle, so I want to get it right before I take the time to do so.
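For what it's worth, a script for the folder-per-movie idea could be sketched along these lines. It is only a sketch under assumptions: the `resolutions` dict stands in for whatever filename-to-resolution lookup you end up with, the extension list is a guess, and it defaults to a dry run that only prints what it would do, since the media is not backed up.

```python
from pathlib import Path
import shutil

# Assumed set of video extensions; adjust to match the actual library.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi", ".m4v"}


def plan_moves(movie_dir: Path, resolutions: dict[str, str]) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs without touching any files.

    Each loose video file is planned into a folder named after the file,
    with " - <resolution>" appended before the extension when known.
    """
    moves = []
    for src in sorted(movie_dir.iterdir()):
        if not src.is_file() or src.suffix.lower() not in VIDEO_EXTENSIONS:
            continue
        label = resolutions.get(src.name)  # hypothetical lookup table
        new_name = f"{src.stem} - {label}{src.suffix}" if label else src.name
        moves.append((src, movie_dir / src.stem / new_name))
    return moves


def apply_moves(moves: list[tuple[Path, Path]], dry_run: bool = True) -> None:
    """Print every planned move; only perform it when dry_run is False."""
    for src, dst in moves:
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite {dst}")
        print(f"{'WOULD MOVE' if dry_run else 'MOVE'} {src} -> {dst}")
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
```

Running `apply_moves(plan_moves(Path("Media/Movies"), resolutions))` first and reading the output before passing `dry_run=False` keeps the scary part under control. Because the move stays on the same disk it is a rename rather than a copy, so it should be quick.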

Link to comment
1 minute ago, manofoz said:

moving to a RAID array

Just so there is no misunderstanding. The Unraid parity array is not a traditional RAID implementation, it is named Unraid for a reason. There is no striping in Unraid, each file exists on a single disk. User shares allow folders to span disks, but each file is on a single disk.

Link to comment
15 minutes ago, trurl said:

I'll start with the first question and see if others want to chime in.

 

 

This is apparently not a complete drive count so not going to be able to give a considered opinion on the first question.

 

But you have the "motivation" for dual parity all wrong. You will know which drive gets disabled, because Unraid will disable it when a write to it fails. That failed write makes it out-of-sync with the array because parity gets updated anyway so the failed write can be recovered. This works the same whether you have 1 or 2 parity.

 

The situation where there might be some uncertainty about which drive is the cause is when you have sync errors when doing a parity check. And dual parity won't clear that situation up either.

 

Dual parity begins to make sense the more data disks you have.

 

Also, since you mention getting some disks that have apparently already had some use, you need to be sure to test each disk before using it in the parity array. Parity by itself can recover nothing. Parity is just an extra bit that allows a missing bit to be calculated from all the other bits (this is how parity works in any system, whether Unraid or something else). All disks must be reliably read to reliably rebuild a failed disk.

 

https://docs.unraid.net/unraid-os/manual/what-is-unraid/#parity-protected-array

 

I had read that article a while back but it's a good refresher. Every disk except for the 4x20TB will not be new. I wanted to get a large Parity drive so I didn't have to worry about it constraining future expansion. I will have 9 drives for certain, and the used ones there are running in my current Plex server and work fine. Two are external because that server ran out of space (it's 14 years old, running a first gen i7 975 processor that can't HW transcode). I don't know how many drives my friend has, probably around 5 since I don't want any below 4TB. I read 2 parity allows 2 drives to fail before you are screwed; not sure if 14 drives minus the parity ones is enough to justify that. I guess I can always convert a second 20 TB drive to parity if I learn that I need it.

 

Is it easy to use unRAID to test the drives before installing them? I can plug them in via USB as unassigned drives if that is helpful. I can also plug them in via USB to a Windows machine and just see that they show up and can be written to. 

Link to comment

Each user share has settings that control how it uses the array and pools. The advantage to having different types of files in different user shares is for these different settings.

 

And since I have already started talking about User Shares

2 hours ago, manofoz said:

4. How much do I need to worry about User Shares and Split Level configurations?

I personally don't bother with split level. It made more sense back in the day of smaller disks, when it might have been necessary to spin up another drive to get the next file when playing a multipart movie, or a music album. With large disks, if you keep plenty of free space on each, and use default High Water allocation, most things that belong together will get written together to the same disk. And the only penalty if something is on a different disk is a spinup delay.

  • Upvote 2
Link to comment
3 minutes ago, manofoz said:

test the drives

Unassigned Devices and Preclear plugins.

 

I think dual parity makes sense for the number of drives you intend, but you might also consider not adding drives until you need them for capacity. Each additional disk requires more hardware, more power. And perhaps more importantly, each additional disk is an additional point of failure.

  • Upvote 1
Link to comment
12 minutes ago, trurl said:

Yes, I believe I am going to be writing a script to move all my Movies into folders with the same name. I can also have the same script append the resolution onto the file it places into the folder. However, because my movies of different resolutions are on different drives, this solution will still create conflicts if I try to mass migrate. The only solution there would be to recursively search all media folders and match files with the same name, then make one folder with that name, use my lookup table to find the resolution of each, and move them into the same folder. However, one will be moved from one drive to another, and I may not have enough space on any given drive to consolidate all of the files. Worth a shot though, as that would make the final migration way easier. I will start plugging away at a move/rename script that parses the path-to-resolution file I exported from Tautulli's API. I could even link the code that calls the API to the move/rename script but it's probably faster to just use an offline dump of the data from Tautulli's

Link to comment
20 minutes ago, trurl said:

Shouldn't matter if all controllers are working well. Marvell controllers are NOT recommended, and RAID controllers should be avoided with Unraid.

Good to know, thank you! I will probably just use the HBA as the cables are a bit smaller. I picked up this guy https://www.amazon.com/dp/B0BVVDT4F1?psc=1&ref=ppx_yo2ov_dt_b_product_details after reading this:

 

Link to comment
27 minutes ago, trurl said:

Unassigned Devices and Preclear plugins.

 

I think dual parity makes sense for the number of drives you intend, but you might also consider not adding drives until you need them for capacity. Each additional disk requires more hardware, more power. And perhaps more importantly, each additional disk is an additional point of failure.

Thanks! That is a good call. Should I always preclear or let Unraid do a Clear and Format when setting up the initial Array and then Preclear when I add additional disks? Reading the documentation it sounded like preclear was optional and unraid would just do a clear if you didn't do that so I didn't really get why it was better than just doing a clear. Also said the array would still be usable while unraid cleared and formatted the new disk so I didn't see a downside that preclear solved. I'm sure I missed something, lot of people mention it. 

 

I should have plenty of capacity with what I have now before getting the mystery disks from my buddy at work. He used them for some crypto thing so they are a bit mysterious but will be a great deal. He's just looking to get rid of them as he stopped doing the crypto thing a while back; it never really netted him anything.

Link to comment

User Shares are simply the combined top level folders on array disks and pools.

 

If you create a User Share in the webUI, Unraid will create a top level folder named for the share on array disks or pools as needed in accordance with the settings for that User Share.

 

Conversely, if you create a top level folder on array disks or pools, it is automatically a user share with default settings, named for the folder.
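As a rough illustration of "combined top level folders", the sketch below lists which disks or pools hold a folder for each share name. The helper name is made up, and the mount paths in the usage note (`/mnt/disk1`, `/mnt/cache`) are an assumption about a typical Unraid layout rather than something stated above.

```python
from pathlib import Path


def user_shares(mount_points: list[Path]) -> dict[str, list[Path]]:
    """Map each top-level folder name to the disks/pools that contain it."""
    shares: dict[str, list[Path]] = {}
    for mount in mount_points:
        for entry in sorted(mount.iterdir()):
            if entry.is_dir():
                # Same folder name on several disks is still one share.
                shares.setdefault(entry.name, []).append(mount)
    return shares
```

Calling something like `user_shares([Path("/mnt/disk1"), Path("/mnt/disk2"), Path("/mnt/cache")])` would show, for example, a Movies share spanning two disks, while each individual file inside it still lives on exactly one of them.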

 

2 hours ago, manofoz said:

5. Will using Krusader to copy files unbalance the drives? Should I use Midnight Commander instead?

You mention Unassigned Devices, and I would recommend that instead of using the network. But the disk write speed will be the bottleneck in any case unless you don't install parity until after the initial data load.

 

You can also use Dynamix File Manager instead of Krusader.

 

As for "balancing" the initial data load, there are ways to manipulate the settings of a User Share to make a batch of files go to a specific disk.

 

User Share settings are mostly about writing new files. If you Include a specific disk in a User Share, then all other disks are excluded, and new files will only be written to that disk. But any files for the share on other disks are still included for reading the share.

 

Also,

1 hour ago, trurl said:

you should definitely not cache the initial data load. It is impossible to move from cache to array as fast as you can write to cache. If you intend to write more at one time than cache can hold, don't cache.

 

Link to comment

Unraid only requires a clear disk when adding it to a new data slot in an array that already has valid parity. This is so parity will remain valid, since a clear disk is all zeros, and zero has no effect on parity. If a disk hasn't been precleared, Unraid will clear it in this scenario.
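A small sketch of why an all-zero disk leaves parity untouched, again modelling single parity as a byte-wise XOR. The helper and the two-byte "disks" are invented for illustration and are not Unraid internals.

```python
from functools import reduce


def xor_parity(disks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized disk images."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


existing = [b"\x12\x34", b"\xab\xcd"]   # hypothetical data disks
parity_before = xor_parity(existing)

cleared_disk = bytes(2)                 # a cleared disk is all zeros
parity_after = xor_parity(existing + [cleared_disk])

# x XOR 0 == x, so adding the cleared disk changes nothing.
assert parity_after == parity_before

# A disk with leftover data would have invalidated the existing parity.
dirty_disk = b"\x00\x01"
assert xor_parity(existing + [dirty_disk]) != parity_before
```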

 

Older versions of Unraid (many years ago) took the array offline while it cleared a disk, so preclear was invented. But the array stays online when clearing now, so the only purpose of preclear is to give a disk a good test. Some preclear a disk for one or more passes to get the disk past "infant mortality". I often say if electronics doesn't fail early it will probably be obsolete before it does.

 

For disks that have been working well, an Extended SMART Self Test should be good enough. You can do that in Unassigned Devices before adding it to the array.

 

Unraid monitors certain SMART attributes and will give a warning if those indicate a problem. For WD disks, you should also have it monitor SMART attributes 1 and 200 (click on the disk to get to its settings).

 

Be sure to set up Notifications to alert you immediately by email or other agent as soon as a problem is detected. Don't allow one ignored problem to become multiple problems and data loss.

 

Parity is not a backup. You must always have another copy of anything important and irreplaceable. You get to decide what qualifies.

Link to comment
41 minutes ago, trurl said:

User Shares are simply the combined top level folders on array disks and pools.

 

If you create a User Share in the webUI, Unraid will create a top level folder named for the share on array disks or pools as needed in accordance with the settings for that User Share.

 

Conversely, if you create a top level folder on array disks or pools, it is automatically a user share with default settings, named for the folder.

 

You mention Unassigned Devices, and I would recommend that instead of using the network. But the disk write speed will be the bottleneck in any case unless you don't install parity until after the initial data load.

 

You can also use Dynamix File Manager instead of Krusader.

 

As for "balancing" the initial data load, there are ways to manipulate the settings of a User Share to make a batch of files go to a specific disk.

 

User Share settings are mostly about writing new files. If you Include a specific disk in a User Share, then all other disks are excluded, and new files will only be written to that disk. But any files for the share on other disks are still included for reading the share.

 

Also,

 

Thanks for the heads up. I will research how to disable the cache before doing the initial migration of data and then enable it once things are all settled. I was under the mindset that the cache filling up would be handled gracefully and things would just go right to the array after. 

 

I don't have any reason to write certain files to certain disks so I don't think I will configure that. However it may still be worth using a user share to partition each type of data. There are some things I can imagine this to be valuable for, for example I'd like more control over who can access my photos than who can access movies & TV shows. I will explore these more, could these be created after the fact or is this something that will be a pain if I don't decide upfront? 

 

Not sure I have any reason to disable Parity while doing the initial migration. I was going to set up all my empty drives first and have parity in place. I don't mind waiting the extra time it would put on the transfer. 

 

Finally, I will check out Dynamix File Manager. Have not heard that mentioned but I'd like to be able to queue up large transfers and let it run overnight. 

Link to comment
1 hour ago, trurl said:

Unassigned Devices and Preclear plugins.

 

I think dual parity makes sense for the number of drives you intend, but you might also consider not adding drives until you need them for capacity. Each additional disk requires more hardware, more power. And perhaps more importantly, each additional disk is an additional point of failure.

 

54 minutes ago, manofoz said:

Thanks! That is a good call. Should I always preclear or let Unraid do a Clear and Format when setting up the initial Array and then Preclear when I add additional disks? Reading the documentation it sounded like preclear was optional and unraid would just do a clear if you didn't do that so I didn't really get why it was better than just doing a clear. Also said the array would still be usable while unraid cleared and formatted the new disk so I didn't see a downside that preclear solved. I'm sure I missed something, lot of people mention it. 

 

I should have plenty of capacity with what I have now before getting the mystery disks from my buddy at work. He used them for some crypto thing so they are a bit mysterious but will be a great deal. He's just looking to get rid of them as he stopped doing the crypto thing a while back; it never really netted him anything.

 

If you want a bit of reasoning for why dual parity can make sense as the number of drives rises, see this thread:

 

     https://forums.unraid.net/topic/50504-dual-or-single-parity-its-your-choice/

 

Modern HDs have an even lower failure rate compared to those from 2016. I would definitely follow @trurl's suggestion and not add any more drives than you need for current storage. Put any extra drives in anti-static bags and put them on a shelf until you need to add them. (I would add all the 20TB drives if they are currently covered under warranty.) I think a lot of folks have had the same experience that I have had: if a drive is still working after about three years, you will probably replace it to gain capacity before it fails. I had a 1TB drive I recently pulled that was 11 years old!

 

Note that there are folks who suggest using Dual parity because it prevents issues that can arise if a second drive has issues when rebuilding a drive. 

  • Upvote 1
Link to comment
25 minutes ago, trurl said:

Unraid only requires a clear disk when adding it to a new data slot in an array that already has valid parity. This is so parity will remain valid, since a clear disk is all zeros, and zero has no effect on parity. If a disk hasn't been precleared, Unraid will clear it in this scenario.

 

Older versions of Unraid (many years ago) took the array offline while it cleared a disk, so preclear was invented. But the array stays online when clearing now, so the only purpose of preclear is to give a disk a good test. Some preclear a disk for one or more passes to get the disk past "infant mortality". I often say if electronics doesn't fail early it will probably be obsolete before it does.

 

For disks that have been working well, an Extended SMART Self Test should be good enough. You can do that in Unassigned Devices before adding it to the array.

 

Unraid monitors certain SMART attributes and will give a warning if those indicate a problem. For WD disks, you should also have it monitor SMART attributes 1 and 200 (click on the disk to get to its settings).

 

Be sure to set up Notifications to alert you immediately by email or other agent as soon as a problem is detected. Don't allow one ignored problem to become multiple problems and data loss.

 

Parity is not a backup. You must always have another copy of anything important and irreplaceable. You get to decide what qualifies.

Thanks. I do have some WD disks, I believe my externals are WD, so I will look into monitoring those SMART attributes. My other disks are Seagate but there could be others; I haven't taken inventory in a while. This all makes sense. I will run that test before adding anything I own to the array and try pre-clear on the mystery drives I may be getting from my friend.

 

I do and will continue to have my photos backed up. The rest of my media I have catalogued; a lot of it came from MakeMKV on discs I have in my attic now. That would suck to have to do again, but in a way they are also backed up.

 

I also wrote a program that fetches exports of all libraries using Tautulli's RESTful API. It then uploads the exports to Google Drive. I was going to run this nightly just to keep a record of what I have in the cloud. It's not a backup, much smaller, but would be a good start in the case of a catastrophic failure.

Link to comment
4 hours ago, manofoz said:

Reading the documentation it sounded like preclear was optional and unraid would just do a clear if you didn't do that so I didn't really get why it was better than just doing a clear.

The reason for doing a preclear is that, as well as clearing the disk (thus avoiding the need for Unraid to do it), it also acts as a stress test for the drive so that you can avoid adding drives that do not pass pre-clear error free. Pre-clear does take longer than the clear process though.

Link to comment
40 minutes ago, itimpi said:
  4 hours ago, manofoz said:

Reading the documentation it sounded like preclear was optional and unraid would just do a clear if you didn't do that so I didn't really get why it was better than just doing a clear.

 

It was also intended to weed out hard disks that would experience 'Infant Mortality'. (Google that term if you are not familiar with it.) In fact, I seem to recall discussions about whether Preclear should include random head movements rather than a simple track-to-track progression across the disk platter(s) to further stress/test the disk to uncover another possible failure mode.

 

I can remember when every PC built was 'burned-in' for 72 hours before it was shipped to the customer.  That apparently is no longer done...

Link to comment
2 hours ago, Frank1940 said:

 

It was also intended to weed out hard disks that would experience 'Infant Mortality'. (Google that term if you are not familiar with it.) In fact, I seem to recall discussions about whether Preclear should include random head movements rather than a simple track-to-track progression across the disk platter(s) to further stress/test the disk to uncover another possible failure mode.

 

I can remember when every PC built was 'burned-in' for 72 hours before it was shipped to the customer.  That apparently is no longer done...

OK, sounds like I should definitely preclear the used drives and maybe even the new ones for a "smoke test" before adding them to the array. Does this apply when starting empty? Pre-clear my brand new drives before setting anything up?

Link to comment
