
Defrag XFS array drives


ljm42


43 minutes ago, Hastor said:

is there a reliable way to get fragmentation stats on an XFS drive?

No

 

44 minutes ago, Hastor said:

I have a lot and it's much slower otherwise

Is this your first fill? If not, you should consider a bigger cache.

 

45 minutes ago, Hastor said:

one 16TB has 12.7TB, the other has 2TB and is filling up more currently. The 10s each have 4TB free

Ok, and sometimes the 10s spin up while the second 16TB (the one with the 2TB fill) is being filled up? This is really strange.

 

46 minutes ago, Hastor said:

I write anywhere from 90-110MB/s over the network.

Ok, but you are limited by the network and not the disk, correct? You should consider 2.5G Ethernet. It's really nice having 2.5x the speed, and the cards aren't too expensive, I think.
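If you want to confirm the bottleneck is the link and not the disks, here is a hedged sanity check (iperf3 is not stock Unraid, so it has to be installed on both ends, and the hostname is illustrative):

# On the Unraid box: start an iperf3 server
iperf3 -s

# On the client PC: measure raw TCP throughput to the server
iperf3 -c tower.local

Gigabit should report roughly 940 Mbit/s; if that matches your ~110MB/s file transfers, the network is the limit, not the array.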

14 hours ago, mgutt said:

Is this your first fill? If not, you should consider a bigger cache.

This is my first fill, migrating from an older storage system. I have two 500GB SSDs that will be my cache once the array is operating normally, but I'm moving more than 500GB at a time right now.

 

14 hours ago, mgutt said:

Ok, and sometimes the 10s spin up while the second 16TB (the one with the 2TB fill) is being filled up? This is really strange.

I'm using "turbo write" aka Reconstruct Write. Just for my first fill to speed things up. Otherwise I only write at about 65MB/s, which I've found in other discussions is quite normal with parity in place and no cache. I'm sure you understand it all, but this is how I understand it to work. Normally it would keep other drives spun down, read the parity info, adjust it for your new data, and re-write it, meaning the parity gets read and re-written on each write, slowing down a mechanical drive a good bit. With Reconstruct Write, it reads from all drives to form the parity data, while only writing to the parity drive. So faster, but more drives taking wear (but the parity drives taking a bit less wear as they are only writing). It makes sense that one 16TB drive has more data than the other as it has been part of the array longer. What's weird is that when reconstruct write is writing in the 10-16TB range, it of course doesn't spin up the 10TB drives as there's nothing in that range to read - however, even though I'm putting my initial files on the new 16TB, some files spin up all drives when being written, meaning they are being placed in the 0-10TB area of the drive, but others only spin up the 16TB drives, meaning they are writing to the 10-16TB area, and there's nothing parity needs from the 10TB drives in that range. However since that new 16TB drive only has 2TB on it, why are the files randomly going back and forth as far as what area of the hard drive they are being written to? One will spin up all drives, the next will only spin up the 16TB ones, then the next might spin up all. It definitely doesn't appear to be placing the files all in a row on the new disk, even though I'm copying them to the array over network one at a time, single-file.

 

14 hours ago, mgutt said:

Ok, but you are limited by the network and not the disk, correct? You should consider 2.5G Ethernet. It's really nice having 2.5x the speed, and the cards aren't too expensive, I think.

 

Yes, my network is just gigabit, so it limits me to around 115MB/s on average. I am using a very entry-level system, but it's all plenty for me, especially once the cache is in place and I disable reconstruct write. I won't care that it takes a while to move the cache contents over while I'm sleeping; I won't be moving more files than my cache holds very often, and gigabit is more than enough for me currently.

I'm mainly trying to protect a large amount of data that was once on a Drobo. I'm definitely getting faster speeds than the Drobo would sometimes give me over USB 3, which dropped to 5-10MB/s at times (even with different drives and a different Drobo). A Drobo also gives you no visibility into its contents; you just trust it and its redundancy, but your drives are useless without a Drobo to read them. My extended protection on my Drobo is about to run out, and they haven't made consumer models in a couple of years (no announcement, but they've been out of stock).

Unraid is working out a lot better for me, and I'm happy with the setup I have for now. I'm only using it as a NAS for my personal stuff and I'm loving it. I just wanted to do anything I can to keep it healthy, including any defragmentation I should be doing. Most of this data is seldom accessed, but the collection as a whole is accessed pretty often, a couple of things at a time.

11 minutes ago, Hastor said:

some files spin up all drives when being written, meaning they are being placed in the 0-10TB area of the drive, but others only spin up the 16TB drives, meaning they are writing to the 10-16TB area, and there's nothing parity needs from the 10TB drives in that range.

I think this is related to the XFS filesystem and not Unraid. I think XFS rearranged some data, so some sectors at the beginning of the disk became free.
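(If you want to check where files actually landed, a hedged example: xfs_bmap prints a file's extent map, which shows which block ranges, and therefore which region of the disk, each file occupies. The file path is illustrative:

# Print the physical extent layout of one file on an array disk
xfs_bmap -v /mnt/disk2/media/example.mkv)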

3 minutes ago, mgutt said:

I think this is related to the XFS filesystem and not Unraid. I think XFS rearranged some data, so some sectors at the beginning of the disk became free.

Makes sense; this is honestly my first experience with XFS on mechanical drives. After the full move of 35TB, I'll run a parity check. As long as that's good, I'm happy!

  • 7 months later...

Just wanted to share a simple bash script that, provided you have no more drives in your pool than there are days in the month, will issue a defrag command on each drive on its corresponding day of the month (e.g. on the 1st of the month, defrag drive 1). This is my, albeit kludgy, way of defragging every drive once per month without putting too much strain on the system. If you've got a better way, please share! I have no evidence this will speed up a system!

 

#!/bin/bash
# for 6.12 and newer use /dev/mdXp1
# for 6.11 and older use /dev/mdX

# Get the current day of the month (no leading zero)
dom=$(date +"%-d")

# Map the day of the month to the matching array device
drive=/dev/md$dom

# Only defrag if a block device exists at that position
if [[ -b $drive ]]; then
    echo "Defragging $drive"
    xfs_fsr -v "$drive"
else
    echo "No drive found at position $drive"
fi
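To schedule it, a hedged example of a daily cron entry (the script path is illustrative; the User Scripts plugin's built-in daily schedule does the same job):

# Run the defrag script every night at 03:00
0 3 * * * /boot/scripts/defrag-daily.sh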

 

 

  • 1 year later...
19 hours ago, KRDucky said:

This script does not work on my system. My mtab lists drives as mdXpX, not mdX, so it fails to find my XFS array. None of my SSDs are XFS.

 

I added this to the first post to bring it up to date:

# for 6.12 and newer use /dev/mdXp1
# for 6.11 and older use /dev/mdX

 

For 6.12+ you would need to change this line in the previous script from:

drive=/dev/md$dom

to:

drive=/dev/md${dom}p1

 


There is a nice script over at this post: https://forums.unraid.net/topic/98033-xfs-extended-defragmentation/

 

It uses the User Scripts plugin, so you can schedule it for whatever time frame fits your pool's use case.

 

I updated it to work with 6.12.6 (I believe it will work on 6.12.8 as well). Either way, I figured I would share it here, since this is the first thread I found on Google discussing this topic.

 

 Updated script as of 3/28/24
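(A hedged sketch of the 6.12-aware device selection, which is the key change here; apart from the /dev/mdX vs /dev/mdXp1 naming from the earlier posts, the details are illustrative:

#!/bin/bash
# Defrag the array drive matching today's day of the month,
# handling both Unraid device naming schemes
dom=$(date +"%-d")

# Unraid 6.12+ exposes /dev/mdXp1; 6.11 and older use /dev/mdX
if [[ -b /dev/md${dom}p1 ]]; then
    drive=/dev/md${dom}p1
elif [[ -b /dev/md$dom ]]; then
    drive=/dev/md$dom
else
    echo "No drive found at position $dom"
    exit 1
fi

echo "Defragging $drive"
xfs_fsr -v "$drive")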

 

  • 1 month later...
On 9/27/2017 at 2:03 PM, SSD said:

I know the general recommendation is, when a drive is being replaced, to remove it and let unRAID rebuild it onto a new disk. I seldom/never do that. I am normally replacing at least 2, and sometimes 3 or 4, at a time. I install them outside the array (making sure unRAID partitioning and formatting is applied) and COPY the files from the array disks to the replacement disks. This results in completely defragmented disks, rather than clones of the existing fragmented disks. After all the copying is done (which can be done in parallel, since there is no parity involved), I do a new config and rebuild parity. Overall I find this a more efficient process, with defragging just being a nice side effect.
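(A hedged sketch of the copy step, assuming the replacement disk is mounted at /mnt/replacement; both paths are illustrative:

# Copy an array disk's contents to the pre-formatted replacement disk,
# preserving permissions, timestamps, and ownership
rsync -avh --progress /mnt/disk2/ /mnt/replacement/)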

 

Of course, all of this is done only after a parity check and inspection of the SMART reports / notification emails to ensure that the drives are healthy. Done properly, it introduces no additional risk, has minimal drawbacks, and gets you through the upgrade cycle much quicker.

 

One of the big benefits of defragmentation is enhanced ability to recover data in the event of some sort of disk corruption event. Unfortunately, XFS is a file system that is particularly poor at being able to recover from such events, defragged or not. So I would question the need to run the defrag tool. Beating the crap out of the drive for 48 hours may not be worth it! With media, the data is typically read back at a slow rate compared to the disk speed, and it will be able to easily keep up with your streaming even with some head movement.

 

Also, when moving to XFS I did some digging into how it works. We tend to think of a drive as having a single file allocation table that maintains the file linkages. But my understanding is that XFS is divided internally into multiple allocation groups, each with its own metadata. This tends to keep files within a band of cylinders and minimize the bad side effect of the heads moving wildly back and forth over the disk surface to read a badly fragmented file.
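(A hedged aside: this allocation-group layout is visible with xfs_info; the mount point is illustrative:

# Show the XFS geometry of an array disk, including agcount and agsize
xfs_info /mnt/disk1)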

 

All in all, I would not be overly concerned with disk fragmentation in an XFS array. And if you are worried about it, next time you want to upsize some disks, use my method. Feel free to ask and I can give more details.


Sorry to quote such an old post, but I'm starting to see the effect of fragmentation on one of my drives (48%). The most logical step to me, as I always have a spare drive ready on hand, is indeed to copy the contents over to the new drive.

As mentioned before, my biggest concern is messing things up. What I was hoping to do is copy the contents of, let's say, Disk2 to the spare disk via Unassigned Devices, then stop the array and swap/assign the old drive out for the new one.

It all sounds too simple to be true. Would this indeed work? Would it trigger a parity check, etc.?
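(For reference, a hedged example of reading that fragmentation figure: xfs_db reports a fragmentation factor, though it simply counts extents and tends to look scarier than it is on mostly-full disks. The device name follows the 6.12+ scheme from the earlier posts:

# Print the fragmentation factor of disk2 (read-only query)
xfs_db -r -c frag /dev/md2p1)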


If it is part of a protected array, then you don't want to just copy it. Parity doesn't know what files are; it's just protecting the data on the drive, whatever it may be. If you rearrange the data, you have to tell it to recalculate the parity - I'm not sure of the exact steps for that.

Rather than copying to a new drive and then using that drive directly, I think it would be better to copy to the new drive, then copy back onto the array. I still don't like that the data is unprotected for a bit while it only exists on the external drive, and you'd need to copy back through the array so that parity is calculated.

I'd also love an easier way to defrag. My array has large drives and I have nowhere to temporarily store the data, but it's getting pretty full and has been in use a long time.

On 5/3/2024 at 9:14 AM, Hastor said:

If it is part of a protected array, then you don't want to just copy it. Parity doesn't know what files are; it's just protecting the data on the drive, whatever it may be. If you rearrange the data, you have to tell it to recalculate the parity - I'm not sure of the exact steps for that.

Rather than copying to a new drive and then using that drive directly, I think it would be better to copy to the new drive, then copy back onto the array. I still don't like that the data is unprotected for a bit while it only exists on the external drive, and you'd need to copy back through the array so that parity is calculated.

I'd also love an easier way to defrag. My array has large drives and I have nowhere to temporarily store the data, but it's getting pretty full and has been in use a long time.

Sorry for the late reply; I forgot to subscribe to the thread.

 

Thank you for your detailed response; it's been very helpful.

 

I do like the sound of manually copying as you mentioned. I have a second drive too, so I might back up to both drives via UD and run a script that also does a bunker check on the transferred files.

 

But all that effort makes the defrag command sound much simpler.

 

