Re: Format XFS on replacement drive / Convert from RFS to XFS (discussion only)



44 minutes ago, tunetyme said:
index of /mnt/user/disk 3
Type Name Size Location Last Modified
 
Parent Directory      
 
S   Disk 3 2017-01-22 15:05
  1 object: 1 directory, 0 files

That seems to indicate that you have a user share named "disk 3" rather than a user share named "disk3", but I'm not entirely comfortable with the way the webUI displays this information. Can you go to the command line and try this?

ls -lah /mnt/user

Please be careful with upper/lower case and spaces when discussing this since they are critical.

Link to comment
2 hours ago, tunetyme said:

Are you saying when you use rsync and copy a file drive one (2TB rfs to 4TB xfs) to drive two that each bit ends up at the exact same address?

No, I'm saying the location of the individual files doesn't matter. Parity doesn't track changes per file, only per disk address. Parity has no concept of files or formats.

Link to comment

OK, let me see if I can explain my dilemma in a different way.

Using bits

At address x on a 2TB RFS drive (not that the format is essential) the bit = 1, and at address y the bit = 0.

I use rsync to copy the data to a 4TB XFS drive. I am assuming that both formats use the same addressing scheme.

Are you saying that address x = 1 and address y = 0 on the new drive as well?

I guess I am thinking like Windows, where if I copy a file from drive a to drive b it also defragments the file when it is copied.

Are files ever fragmented under Linux?

Link to comment
1 hour ago, tunetyme said:

Are you saying when you use rsync and copy a file drive one (2TB rfs to 4TB xfs) to drive two that each bit ends up at the exact same address?

 

If so, that's amazing, and whoever wrote that command didn't have a life for quite a while.

Again, if you are old school like me, then "show me". That's why, if I verify parity before starting the process and then substitute the 4TB drive for the 2TB drive in the array, I would want to see that parity is still valid. If I do it one time then I will trust it. Remember, I started back in the day of punch cards, then DECwriters (tons of greenbar), and finally CRTs. We were taught to be skeptical: GIGO (garbage in, garbage out). I remember when they were trying to get the code rock solid for basic things like a keyboard. It may be that kids these days don't have to contend with this anymore, but this was my training.

Lots of "old schoolers" here.

 

Sounds like you understand parity, but you are applying it to the wrong set of bits. The "exact same address" being referred to here is the equivalent position on each disk. It has absolutely nothing to do with files, which are the only thing rsync knows about.

 

Don't think about files. Parity is not at all about files. That's why it doesn't matter what the filesystem is, or where files are located. As far as parity is concerned, it is all a bunch of bits.

 

Don't think about the bytes on each disk, think about the bits on all disks. The parity bit at a specific position on the parity disk is calculated based on taking that bit at that same position on each data disk, and adding those bits together, and applying the rule for even parity. (Actually, it just does the XOR of all bits which produces the same result and is more efficient.)

 

This method of parity calculation is similar to what parity-based RAID levels do (RAID 5, for example), and even though unRAID is not RAID, this is what it does.

 

Parity doesn't contain a parity bit for each byte of each disk. That wouldn't allow it to reconstruct the missing byte from a missing disk since a single parity bit can't tell you much about a missing byte.

 

But a parity bit that corresponds to each bit of all disks allows the bit of any single missing disk to be calculated from the corresponding bits of all the other disks. This is why we say you can only have one missing disk, because all the other disks (not just parity) are required to calculate the missing disk's data. Dual parity gives us an additional bit to use so it allows 2 missing disks, but the principle is the same.
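Here is a minimal Python sketch of the idea above. It assumes three tiny, made-up data disks of equal size represented as byte strings. The names `xor_parity` and `rebuild_missing` are purely illustrative and are not anything unRAID exposes.

```python
from functools import reduce


def xor_parity(disks):
    """Return the parity block: XOR of the bytes at each position across all disks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild_missing(surviving_disks, parity):
    """Recompute one missing disk from parity plus every surviving data disk."""
    return xor_parity(list(surviving_disks) + [parity])


# Hypothetical contents of three equally sized data disks (two bytes each).
disk1 = bytes([0b10110010, 0b00001111])
disk2 = bytes([0b01100001, 0b11110000])
disk3 = bytes([0b11111111, 0b00000000])

# Parity only sees positions and bits, never files or filesystems.
parity = xor_parity([disk1, disk2, disk3])

# Pretend disk2 died: parity plus ALL the other data disks give it back.
recovered = rebuild_missing([disk1, disk3], parity)
```

Note that `rebuild_missing` needs every surviving disk, which is the reason single parity tolerates only one missing disk.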


Think about this and if you think you finally understand how parity is calculated, reread what I posted earlier and see if it makes more sense.

 

Link to comment
3 minutes ago, trurl said:

Sounds like you understand parity, but you are applying it to the wrong set of bits. The "exact same address" being referred to here is the equivalent position on each disk. It has absolutely nothing to do with files, which are the only thing rsync knows about.

I think in terms of absolute addresses, not relative. I need to think about that for a while. I guess I need to throw some rocks in the pond and contemplate relativity along with my navel. I guess I'll go hitch up the buggy and go to town...

So rsync doesn't do anything more than copy files. I had the impression that it does much more on a lower level.  

Link to comment
11 minutes ago, tunetyme said:

OK, let me see if I can explain my dilemma in a different way.

Using bits

At address x on a 2TB RFS drive (not that the format is essential) the bit = 1, and at address y the bit = 0.

I use rsync to copy the data to a 4TB XFS drive. I am assuming that both formats use the same addressing scheme.

Are you saying that address x = 1 and address y = 0 on the new drive as well?

I guess I am thinking like Windows, where if I copy a file from drive a to drive b it also defragments the file when it is copied.

Are files ever fragmented under Linux?

Words have apparently not made things less confusing, so I'll give a more practical example a stab. Let's pretend we have three drives, a 4 bit parity disk, a 2 bit data disk (Disk 1) and a 3 bit data disk (Disk 2):

 

We start out with the following raw contents:

Parity: 0110
Disk 1: 01
Disk 2: 001

Then we copy a (pretend) one-bit file from Disk 1 (second bit) to Disk 2 (different filesystem, so it ends up at a different address on the other disk). Now we have:

Parity: 1110
Disk 1: 01
Disk 2: 101

Replace the small Disk 1 with a fancy new 4 bit disk and rebuild. Extra space is filled with values calculated using parity disk and the other data disk, so parity remains valid:

Parity: 1110
Disk 1: 0100
Disk 2: 101
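The three states above can be checked with a short Python sketch. It assumes, as the example does, that positions past the end of a smaller disk count as 0. The helper name `parity_of` is made up for illustration.

```python
def parity_of(disks, width):
    """XOR the bit at each position across all disks.

    Positions past the end of a shorter disk contribute 0.
    """
    bits = []
    for pos in range(width):
        bit = 0
        for disk in disks:
            if pos < len(disk):
                bit ^= int(disk[pos])
        bits.append(str(bit))
    return "".join(bits)


WIDTH = 4  # size of the parity disk in bits

# Starting state: Disk 1 = 01, Disk 2 = 001.
start = parity_of(["01", "001"], WIDTH)          # "0110"

# After the one-bit "file" lands on Disk 2, parity follows the write.
after_copy = parity_of(["01", "101"], WIDTH)     # "1110"

# Rebuild Disk 1 onto a 4-bit disk from parity and the other data disk.
rebuilt_disk1 = parity_of([after_copy, "101"], WIDTH)      # "0100"

# Parity recomputed from the rebuilt disk matches the parity we already had.
after_rebuild = parity_of([rebuilt_disk1, "101"], WIDTH)   # "1110"
```

The rebuilt Disk 1 keeps its original two bits, and the extra space comes out as whatever values keep the existing parity valid.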

 

Link to comment
3 minutes ago, tunetyme said:

I think in terms of absolute addresses, not relative. I need to think about that for a while. I guess I need to throw some rocks in the pond and contemplate relativity along with my navel. I guess I'll go hitch up the buggy and go to town...

So rsync doesn't do anything more than copy files. I had the impression that it does much more on a lower level.  

I use rsync all the time to copy between different filesystems, and between different computers using different operating systems. In fact, copying between different computers is pretty much why rsync exists. The differences between filesystems and operating systems can be pretty significant at the lower levels, but at the file level a file is a file.

 

And if you want to get technical, rather than absolute and relative addresses, maybe we should talk about logical addresses. There is a lot going on at different levels of the system, such as drive firmware.

Link to comment
1 minute ago, trurl said:

I use rsync all the time to copy between different filesystems, and between different computers using different operating systems. In fact, copying between different computers is pretty much why rsync exists. The differences between filesystems and operating systems can be pretty significant at the lower levels, but at the file level a file is a file.

 

And if you want to get technical, rather than absolute and relative addresses, maybe we should talk about logical addresses. There is a lot going on at different levels of the system, such as drive firmware.

 

A little off topic - but thought I would ask this question of the rsync fanboys. :)

 

How does rsync verify that file written is same as source file?

 

In particular, does it do a read of the written data as the verification?

 

The reason I ask is based on your statement ... 

 

"In fact, copying between different computers is pretty much why rsync exists"

 

When I researched this a while back, I thought rsync was focused more on the data being received accurately from the remote computer, and it was not 100% clear that it verified the written data.

 

There is a "bug" (Tom may disagree with this characterization) that may or may not have been fixed. I encountered it several times with a pesky problem I was having - in which my target drive was dropping from the array during a copy. You would think, in that scenario, that PARITY would be maintained properly even if the target disk failed. But that's not what happened. The write to parity DID NOT OCCUR or was not done correctly. But the OS never got an error or indication that anything went wrong, and the copy of subsequent files continued unabated. All prior and subsequent file copies update parity properly but that one I/O, the one that triggered the kick, was not reflected in parity.

 

The reason I found this was that I was using Teracopy. Teracopy does one read of the data from the source, computing the MD5 as it goes and copying the data to the destination. It then reads the destination file to compute its MD5. If the MD5s don't match, it reports the problem. I was shocked when I saw Teracopy report the MD5 verification failure in the middle of a sea of files, but it led me to the investigatory steps that produced my bug analysis above.
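The read-once, write, then re-read-and-compare flow described above can be sketched in Python roughly like this. It is only an illustration of the approach, not how Teracopy is actually implemented. The function name `copy_with_verify` and the temporary file names are made up.

```python
import hashlib
import os
import tempfile


def copy_with_verify(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src to dst, hashing the source while reading, then re-read dst and compare."""
    src_md5 = hashlib.md5()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            src_md5.update(chunk)   # hash the source during its single read
            dst.write(chunk)
        dst.flush()
        os.fsync(dst.fileno())      # ask the OS to push the data to the device

    # Second pass: read the destination back and hash what is actually there.
    dst_md5 = hashlib.md5()
    with open(dst_path, "rb") as dst:
        for chunk in iter(lambda: dst.read(chunk_size), b""):
            dst_md5.update(chunk)

    return src_md5.hexdigest() == dst_md5.hexdigest()


# Small self-contained demonstration using temporary files.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source.bin")
    target = os.path.join(tmp, "target.bin")
    with open(source, "wb") as f:
        f.write(os.urandom(4096))
    verified = copy_with_verify(source, target)
```

One caveat: on Linux the re-read may be served from the page cache rather than from the disk itself, so a check like this on a local copy does not necessarily prove what ended up on the platters.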

 

I have been using rsync and really like it a lot. It is considerably faster than Teracopy (leading in part to questioning how it works). I always check to see if a disk kicked before deleting the source files after using rsync. Since using it I have not had any disks kick from my array during rsync, so have not been able to prove or disprove if the "bug" is fixed or if rsync is detecting the problem if this bug remains.

Link to comment
3 minutes ago, bjp999 said:

How does rsync verify that file written is same as source file?

 

I think this explains it well:


 

Quote

 

rsync always uses checksums to verify that a file was transferred correctly. If the destination file already exists, rsync may skip updating the file if the modification time and size match the source file, but if rsync decides that data need to be transferred, checksums are always used on the data transferred between the sending and receiving rsync processes. This verifies that the data received are the same as the data sent with high probability, without the heavy overhead of a byte-level comparison over the network.

Once the file data are received, rsync writes the data to the file and trusts that if the kernel indicates a successful write, the data were written without corruption to disk. rsync does not reread the data and compare against the known checksum as an additional check.

 

 

 

Link to comment
21 minutes ago, johnnie.black said:

 

I think this explains it well:


 

 

 

 

That was my fear.

 

So, on a single computer and knowing the files are NOT going to already exist at the destination location, rsync is no better than cp or mv. Correct?

Link to comment
51 minutes ago, gubbgnutten said:

apparently not made things less confusing, so I'll give a more practical example a stab. Let's pretend we have three drives, a 4 bit parity disk, a 2 bit data disk (Disk 1) and a 3 bit data disk (Disk 2):

 

 I do understand how parity itself works and I understand how it works across drives basically odd or even. What I was trying to express is once I have made a duplicate set of files (rsync) on two different size disks with different formats (RFS v XFS) I am now able to remove the old disk and replace it with the new disc (same slot) and parity is still valid.

By removing the old disk, did I just remove one of the bits being counted? By swapping the new disk with the old and removing the old disk (changing slots and removing the old disk completely), how is parity maintained? I have just swapped and removed one disk that was being counted in the odd/even for each bit of data. I used to deal with parity issues when dealing with serial and parallel data communications. As I have tried to explain, I tend to think in terms of an absolute address on each disk where a bit is counted as odd or even. Any change in the bit being counted or in the quantity of bits being counted (number of disks) changes parity.

 

As I have said I need to think about what relative addressing means.

Link to comment
4 minutes ago, bjp999 said:

So, on a single computer and knowing the files are NOT going to already exist at the destination location, rsync is no better than cp or mv. Correct?

 

AFAIK correct; after the initial copy you'd need to run rsync -c /source /destination to compare checksums.
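To show what a checksum comparison after the copy amounts to, here is a rough Python sketch that walks a source tree and reports files that are missing or different on the destination. It mimics the purpose of the second pass only; it is not how rsync implements `-c`, and the names `file_md5` and `mismatched_files` are made up.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1024 * 1024):
    """MD5 of a file, read in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_files(src_root, dst_root):
    """Return relative paths that are missing from dst_root or differ in content."""
    problems = []
    for folder, _dirs, files in os.walk(src_root):
        for name in files:
            src_file = os.path.join(folder, name)
            rel = os.path.relpath(src_file, src_root)
            dst_file = os.path.join(dst_root, rel)
            if not os.path.isfile(dst_file) or file_md5(src_file) != file_md5(dst_file):
                problems.append(rel)
    return sorted(problems)


# Demonstration with two small temporary trees.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.makedirs(os.path.join(src, "Movies"))
    os.makedirs(os.path.join(dst, "Movies"))
    for root in (src, dst):
        with open(os.path.join(root, "Movies", "a.bin"), "wb") as f:
            f.write(b"same content")
    with open(os.path.join(src, "b.bin"), "wb") as f:
        f.write(b"only on the source")
    report = mismatched_files(src, dst)
```

An empty report means every source file has a byte-identical counterpart on the destination. As with rsync -c, this reads the source files a second time.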

Edited by johnnie.black
Link to comment
1 hour ago, trurl said:

And if you want to get technical, rather than absolute and relative addresses, maybe we should talk about logical addresses. There is a lot going on at different levels of the system, such as drive firmware.

 

I may need to dig a little deeper to comprehend this.

 

Still haven't been able to solve the disk3 share issue does anyone have a way for me to address this?

Link to comment
10 minutes ago, johnnie.black said:

 

AFAIK correct; after the initial copy you'd need to run rsync -c /source /destination to compare checksums.

 Which would cause a re-read of the source files. Is there a tool or way to structure such that the source files are read once, the destination files are written once, and the destination files are read once? (like Teracopy but doing it local on the server)

Link to comment
18 minutes ago, bjp999 said:

Is there a tool or way to structure such that the source files are read once, the destination files are written once, and the destination files are read once?

 

Don't know of any; I use rsync. Unless a disk redballs during the copy, I assume every file was copied OK. If there's a disk issue, I already have checksums, so I run a check on that disk only.

Link to comment
35 minutes ago, tunetyme said:

 

 I do understand how parity itself works and I understand how it works across drives basically odd or even. What I was trying to express is once I have made a duplicate set of files (rsync) on two different size disks with different formats (RFS v XFS) I am now able to remove the old disk and replace it with the new disc (same slot) and parity is still valid.

By removing the old disk, did I just remove one of the bits being counted? By swapping the new disk with the old and removing the old disk (changing slots and removing the old disk completely), how is parity maintained? I have just swapped and removed one disk that was being counted in the odd/even for each bit of data. I used to deal with parity issues when dealing with serial and parallel data communications. As I have tried to explain, I tend to think in terms of an absolute address on each disk where a bit is counted as odd or even. Any change in the bit being counted or in the quantity of bits being counted (number of disks) changes parity.

 

As I have said I need to think about what relative addressing means.

Sorry, where did you get the idea that parity remains valid if you remove a disk? Exactly what guide/procedure are you following?

 

With single parity you can reorder disks and still maintain valid parity, but parity won't remain valid after removing a disk (unless you actually write zeros to the entire raw disk before removing it).

Link to comment
48 minutes ago, tunetyme said:

 

 I do understand how parity itself works and I understand how it works across drives basically odd or even. What I was trying to express is once I have made a duplicate set of files (rsync) on two different size disks with different formats (RFS v XFS) I am now able to remove the old disk and replace it with the new disc (same slot) and parity is still valid.

By removing the old disk, did I just remove one of the bits being counted? By swapping the new disk with the old and removing the old disk (changing slots and removing the old disk completely), how is parity maintained? I have just swapped and removed one disk that was being counted in the odd/even for each bit of data. I used to deal with parity issues when dealing with serial and parallel data communications

You seem to be mixing in some other things here that are not really part of the conversion procedure in the wiki that we are discussing. Pretty much everything you are saying in this particular post does not maintain parity.

 

If you remove a disk without replacing it, parity is invalid since the missing bits from the missing disk were part of parity.

 

If you replace a disk, normally unRAID will rebuild the replacement using the parity calculation. The rebuild will have the exact same contents as the original, and the filesystem of the original is its contents. A replacement/rebuild does maintain parity. Actually, it maintains the replacement data disk so that it agrees with the existing parity. Parity is not written to rebuild a data disk.

 

If you replace a disk and set a new configuration instead of rebuilding, you have invalidated parity since the replacement wasn't rebuilt from parity. Parity still has the calculation from the bits of the original disk, and not the bits of the new (and not rebuilt) disk.

 

So, I guess you could say that you've got it right.:) You think the scenarios you describe would invalidate parity, and they would.

Link to comment
9 minutes ago, tunetyme said:

 

I may need to dig a little deeper to comprehend this.

 

Still haven't been able to solve the disk3 share issue does anyone have a way for me to address this?

I am having a hard time following some of these questions.

 

The principles are pretty easy.

 

1. Each disk is a long string of 1s and 0s. Parity is based on those 1s and 0s.

2. Whether parity is even or odd is not important. That is a low level decision with no impact on the end user.

3. If you format a disk, some of those 1s and 0s get updated. Parity is updated.

4. If you write a file to a disk, some of those 1s and 0s get updated. Parity is updated.

5. If a new disk that is all 0s is inserted into an array, parity remains valid (0s don't affect parity).

6. If a new disk that is not all 0s is inserted into an array, parity is made invalid (this is possible with the trust parity procedure).

7. If an existing disk is taken out of the array, parity is made invalid (this is possible with the trust parity procedure).

8. The order of the disks is unimportant to single parity. For "parity2", order is significant, and rearranging the disks will make parity2 invalid (this is possible with the trust parity procedure).
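Principles 5 through 8 can be checked with a few lines of Python. This is a sketch with made-up two-byte "disks", and it models single (XOR) parity only, not parity2.

```python
from functools import reduce


def single_parity(disks):
    """XOR parity across equally sized disks, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*disks))


# Hypothetical contents of a two-disk array and its parity.
disk1 = bytes([0x12, 0x34])
disk2 = bytes([0xAB, 0xCD])
parity = single_parity([disk1, disk2])

# Principle 5: adding an all-zero disk leaves parity unchanged.
zeroed = bytes(2)
zero_disk_ok = single_parity([disk1, disk2, zeroed]) == parity

# Principle 6: adding a disk that is not all zeros makes the old parity wrong.
used = bytes([0x01, 0x00])
nonzero_disk_ok = single_parity([disk1, disk2, used]) == parity

# Principle 7: removing a (non-zero) disk makes the old parity wrong.
removal_ok = single_parity([disk1]) == parity

# Principle 8: order does not matter for single parity, because XOR is commutative.
reorder_ok = single_parity([disk2, disk1]) == parity
```

Here `zero_disk_ok` and `reorder_ok` come out True, while `nonzero_disk_ok` and `removal_ok` come out False.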

 

Using those principles, you should understand ...

1 - Parity depends not just on what data is stored on a disk, but on how that content is stored (which sectors, how the filesystem works, etc.). Two disks can contain the exact same files and still not be interchangeable as far as parity is concerned.

2 - Parity can be used to rebuild a disk. It can't be used to rebuild a single file.

3 - If an original disk was RFS, and you rebuild it, the restored disk will be RFS. If you fool unRAID into thinking that disk is XFS, unRAID will report it as unmountable. It is then easy to click a button and reformat it as XFS. Be careful about reformatting disks you believe should contain data.

 

 

Link to comment

Maybe you're confusing replacing an old disk with a new disk, and just swapping two disks that are already part of the parity calculation. Swapping two disks that are already part of the parity calculation doesn't invalidate parity, though it would invalidate parity2 since the order of the disks is involved in the somewhat more complicated (Q) parity2 calculation.

 

If you need to replace a disk with a larger disk during the conversion process, you should let it rebuild the replacement before proceeding. Doesn't really matter whether the original had files on it, or whether you formatted the original first, a complete rebuild of the replacement would still be needed or you would invalidate parity.

 

I suppose there are other approaches you could use instead of rebuilding the replacement in this scenario if you no longer needed the files from the original, but they would all involve rebuilding parity instead.

Link to comment
11 minutes ago, bjp999 said:

4. If you write a file to a disk, some of those 1s and 0s get updated. Parity is updated.

4a. If you delete a file from a disk, some of those 1s and 0s get updated. Parity is updated.

 

Whether formatting, writing, or deleting, it's all writes as far as the bits are concerned.
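A small Python sketch of why any write, whatever caused it, can keep parity in step. It assumes the usual read-modify-write shortcut for XOR parity; I am not claiming this is exactly how unRAID's driver does it, and the byte values are made up.

```python
def updated_parity(old_parity, old_data, new_data):
    """New parity byte after one data byte changes: flip exactly the bits that changed."""
    return old_parity ^ old_data ^ new_data


# Hypothetical bytes at the same position on three data disks.
d1, d2, d3 = 0b10101010, 0b00001111, 0b11000011
parity = d1 ^ d2 ^ d3

# Overwrite the byte on disk 2 (a file write, a delete, or a format all look like this).
new_d2 = 0b11110000
parity_after_write = updated_parity(parity, d2, new_d2)

# Same result as recomputing parity from every disk from scratch.
recomputed = d1 ^ new_d2 ^ d3
```

The update needs only the old data, the new data, and the old parity, so the other data disks do not have to be read for each write.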

Link to comment

This is the procedure that I have been doing:

 

File System Conversion

  1. If you have not run a successful Parity Check recently, do so now. You want to be certain that the array is perfect before you start
  2. Prepare a strategy for the order of drive conversion. Because you can't replace a larger drive with a smaller drive (unless the total file space used will fit on the smaller drive), you will have to order the conversions so that your largest data drive is first, then the next largest, then the next, with the smallest data drive being last. Obviously, the order doesn't matter for drives that are the same size.
  3. If your empty swap drive is not already installed and assigned, install it, and with the array stopped, assign it to the next empty drive slot (for our example, we will assign it to Disk 11)
  4. Click on the disk name of your swap drive (e.g. 'Disk 11') and change the format to XFS if it isn't already, then click Apply and Done
  5. If you have enabled User Shares (and most users have), go to Settings -> Global Share Settings and add your swap disk to ' Excluded disk(s) ' (for our example, we would put disk11)
  6. Start the array; your empty swap drive should show as 'Unmountable', and a Format button will be present
  7. Click the check box for formatting, then click the Format button; it takes a few minutes, says it's formatting; when done, array should show an additional drive, almost completely empty, formatted with XFS
  8. At the console or within a screen session, copy all data from your drive to be converted to the empty swap drive; use an rsync command based on the following, except change the drive numbers as appropriate for your system; type it exactly with the same slashes, upper and lower case matter; this command will take a long time but parity will be fully preserved; when complete, prompt should return with no errors showing; your array now has 2 drives that are identical except for their format, their file system (one of them is excluded from shares)
    rsync -avPX /mnt/disk10/ /mnt/disk11/
    (in our example, we are copying our large disk10 to the new and empty swap drive)
  9. This step is optional, as the previous rsync automatically checksums each transfer. But if you would like to verify that the end-to-end transfer was perfect, perform the next rsync command below; it will take a long time, and probably nothing will be copied unless the drive has been updated (see warning below!) since the full copy above; there's no progress info, it's over when the prompt returns
    rsync -rcvPX /mnt/disk10/ /mnt/disk11/
  10. Stop the array; we are now going to swap the drive assignments
  11. Click on Tools, then New Config, then Retain current configuration:, then select All, then check Yes I want to do this, click Apply then Done
    Important Warning! Doing a New Config will reset the file system type for all disks to Auto! While usually this is not a problem, especially with the latest unRAID, in some circumstances this can lead to unmountable disk(s). If that happens, then you need to select the correct file system for those disk(s). If in doubt, ask for help!
  12. Go back to the Main page and click on the dropdown for the swap drive (e.g. Disk 11) and unassign it (click on "unassigned" or "no device")
  13. Click on the dropdown for the other drive (the one being converted, e.g. Disk 10 to start), and reassign it as the physical drive of the swap drive, the drive that was empty (e.g Disk 11)
  14. Click on the dropdown for the slot of the swap drive (e.g. Disk 11) and reassign it to the physical drive that was being converted (e.g. Disk 10)
  15. Important! Click on each drive name (e.g. Disk 10 and Disk 11) and swap the file system format of the drive - if it's ReiserFS change it to XFS, if it's XFS change it to ReiserFS; it's important to swap the disk formats as well as the physical drive assignments
    At this point, you have now swapped the 2 drives, which is fine as they are identical (except for file system format); parity remains valid because the same drives are assigned, their slot does not matter; however if you have a second parity drive, it's now invalid!
  16. You should see all array disks with a blue icon, a warning that the parity disk will be erased, and a check box for Parity is already valid; IMPORTANT! click the check box, make sure it's checked to indicate that Parity is already valid or your Parity disk will be rebuilt! then click the Start button to start the array; it should start up without issue and look almost identical to what it looked like before the swap, with no parity check needed; however the XFS disk is now online and its files are now being shared as they normally would; check it all if you are in doubt
  17. If you are sure it's all fine, stop the array and click the empty swap disk slot (e.g. still Disk 11), and change the format to XFS, then click Apply and Done
  18. Start the array; the Format button should be available, format it now; when done, your empty disk slot now has a fresh and empty disk formatted with XFS and ready to fill again; your data drive has completed the conversion process and is already back online, with all files and shares intact, but formatted with XFS
  19. You are now ready to convert the next drive, so circle back to Step 8 and repeat these steps (Step 8 through Step 18), substituting your next drive to be converted; the empty and excluded swap disk slot will always be the same (e.g. always Disk 11 in our example), the other will change as you convert different data drives
When done, you have an empty XFS drive appended to your system, probably your smallest drive, and still excluded. It's up to you what you want to do with it. You can leave it as is, or you can unassign it and rebuild parity, or you can use the parity preserving remove-a-drive procedure, instructions elsewhere. Remember, it's probably still globally excluded from shares.
I do recommend that if you are going to try this procedure, you read through the steps and notes carefully until you fully understand them, and understand the importance of each detail. Missing a step or typing the wrong disk number could be disastrous!
If you wish, you can perform parity checks at any point during and after. I don't believe they are necessary, I only did one before starting, and I believe I did another after it was all done.
Warning! If you run the verification copy in Step 9, and it actually copies files, then it is likely you have a process still changing the drive! These newly copied files were not there for the Step 8 copy! You need to determine what process (Docker, plugin, VM, an external backup, or the Mover) made the changes to this drive, and stop it. Then you may need to run Step 9 again, because the process may have made even more changes to folders, after the Step 9 rsync process had moved past those folders. In summary, if the Step 9 copy actually copies any files, then you should probably repeat Step 9 until nothing is copied.
I've checked the above pretty carefully, if you see any errors, PLEASE let me know ASAP! I'm sure it can be improved. Steps 17 and 18 are a repeat of 4, 6, and 7, but it seemed safer to do it this way.
Please let us know of any issues or suggestions!
____________________________________________________________________________________________________________________________________________________________
Step 8
Everything was going according to the above method until I missed a /.
rsync -avPX /mnt/disk10/ /mnt/disk11/ is the proper command. I missed the / (slash) after disk10. This created another share someplace that we can't find.
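For anyone wondering why one missing slash matters, here is a tiny Python model of rsync's trailing-slash rule. It is only a model of the documented behaviour, not rsync itself. The function name `rsync_destination` and the file `Movies/film.mkv` are hypothetical.

```python
import posixpath


def rsync_destination(source, dest, relative_file):
    """Model where `rsync -a SOURCE DEST` puts a file found at SOURCE/relative_file.

    With a trailing slash on SOURCE, only its contents are copied into DEST.
    Without one, the last directory of SOURCE is recreated inside DEST first.
    """
    if source.endswith("/"):
        return posixpath.join(dest, relative_file)
    return posixpath.join(dest, posixpath.basename(source), relative_file)


# Command as written in the guide: contents of disk10 land directly on disk11.
intended = rsync_destination("/mnt/disk10/", "/mnt/disk11/", "Movies/film.mkv")
# -> "/mnt/disk11/Movies/film.mkv"

# Same command with the slash after disk10 missing: an extra top-level folder appears.
mistyped = rsync_destination("/mnt/disk10", "/mnt/disk11/", "Movies/film.mkv")
# -> "/mnt/disk11/disk10/Movies/film.mkv"
```

As I understand it, unRAID presents every top-level folder on a data disk as a user share, so that extra folder would show up as a share named after the source disk.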
 
I did steps 10 -16 where I became confused about parity. RobJ has updated this section as above to help with the confusion (see previous 2 pages). 
 
Step 16 was the start of the confusion about parity.
Edited by tunetyme
clarify
Link to comment
23 minutes ago, tunetyme said:
I did steps 10 -16 where I became confused about parity. RobJ has updated this section as above to help with the confusion (see previous 2 pages). 
 
Step 16 was the start of the confusion about parity.

Step 16? Pretty sure your confusion about what was happening started no later than step 14. :)

Link to comment
