rsync/copy drive for backup crashes v5 release



Hi - I am trying to copy the contents of a 3TB array drive and a 750GB cache drive so that I can replace them. I have tried both rsync (-avh) and cp (-ar). Both crash the unRAID server: the ssh window closes automatically, the server becomes unreachable from the client computer, and on a terminal physically connected to the machine I saw some OOM errors. I cannot capture a syslog because the system is non-responsive at that point. Searching showed that this bug/issue was seen by a few other users and pointed to the Red Hat distribution.

 

I have tried various ways to copy the disk contents, such as splitting the overall 2.5TB of content into 3 folders and copying them one at a time. The machine crashes after about 15 minutes of copying. I normally have 8GB of RAM; I reduced it to 2GB and tried various other fixes to see whether this was a memory-related bug, but nothing helped. Any suggestions on what I could try to copy the contents?

 

Link to comment

If you are using a Windows machine as the destination for the copies, TeraCopy works pretty well, and it has file integrity verification as an option.

 

However... I must ask, why are you trying to copy all the data? If you simply want to replace the drives, the easiest method is to run a parity check and look at the SMART reports for any failing drive. Assuming all your drives are healthy, swap out the old drive for the new, tested drive and let it rebuild, then repeat the whole process, parity check included, for the second drive. That way the old drives still contain all your data and can be used as backups if you don't already have the data backed up elsewhere.

 

Maybe a little more detail on your situation would allow a better answer.

Link to comment

Jon - Before I realized the drive in question was giving I/O errors, the system had already logged more than 10,000 parity sync corrections. I assume the I/O errors occurred while a bunch of data was being written to the bad drive. I can afford to lose 'some' data on the bad drive, and I wanted to retrieve the data before running reiserfsck --rebuild-tree. Reconstructing data from a 4TB parity drive can take a week, and I am trying to avoid rebuilding multiple drives serially, which also reduces server throughput.

 


Link to comment

... Reconstructing data from a 4TB parity drive can take a week, I am trying to reduce rebuilding multiple drives serially which also has reduced server throughput...

Are you sure you know how unRAID works? This sounds like something you might say about a different RAID system.
Link to comment

I tend to think I know at least a little about unraid but there is always more to learn, so please let me know if I missed something here.

 

I typically stream Blu-ray ISOs over a wired gigabit connection. If parity is being rebuilt, the streaming stutters due to reduced I/O. I have read that others see a similar situation... So to minimize this downtime I am looking to replace 2-3 drives in one go. I understand the array is unprotected during this time, but I am willing to take the risk.

 

Now back to the original question: how do I copy a drive without getting OOM errors? Could any particular hardware be contributing to this?


 

Link to comment


The reason I asked whether you know how unRAID works is that rebuilding a data drive from a 4TB parity drive plus the data on the other drives should typically take only several hours, not a week.

 

Also, once you have rebuilt a data drive, there is no need to rebuild any of the other drives, since all data disks are completely independent. In fact, if there was anything wrong with any of the other disks, you would not be able to successfully rebuild the first disk.

 

unRAID is not RAID. Data is not striped.

 

Link to comment

A simple parity check with a 4TB drive, without writing any data, takes ~3 days at 20-25MBps, and writing/reconstructing takes more than that, ~5 days. Isn't that normal? If it isn't, there are some serious hardware bottlenecks. Can someone chime in with their typical parity check and reconstruction times as well?

 

Thanks!

 


Link to comment

Great, thanks! I started the parity check this morning, will check this evening on the reported speed. When I am accessing the shares I see only about 20-25MBps, maybe uninterrupted it is more ...

 

Here is a recent poll with parity check speeds. I only have 3TB parity, but my speed came out at 107 MB/sec last check, a little under 8 hours.
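
As a rough cross-check of these numbers, the time for one full parity pass is approximately drive size divided by average speed. A minimal sketch, assuming decimal units (1 TB = 1,000,000 MB) and a constant average speed; `pass_hours` is a hypothetical helper name, not an unRAID command:

```shell
# Estimated hours for one full pass over a drive: size / average speed.
# Assumption: decimal units (1 TB = 1,000,000 MB) and a constant speed.
# pass_hours is a hypothetical helper, not part of unRAID.
pass_hours() {
    # $1 = drive size in TB, $2 = average speed in MB/s
    awk -v tb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", tb * 1000000 / mbps / 3600 }'
}

pass_hours 3 107   # prints 7.8  (3TB at 107 MB/s)
pass_hours 4 25    # prints 44.4 (4TB at 25 MB/s)
pass_hours 4 20    # prints 55.6 (4TB at 20 MB/s)
```

At 107 MB/s a 3TB pass works out to a little under 8 hours, matching the figure above, while a 4TB pass at 20-25 MB/s comes to roughly two days.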

Link to comment

Your sig says you removed Simplefeatures. If you do have Simplefeatures installed, or have replaced it with Dynamix, both of those are known to affect parity check speeds while you have the webGUI refreshing. Dynamix has a setting to stop it from automatically refreshing.

Link to comment

I have Simplefeatures currently running, thanks for that pointer as well. I have a couple of cheap sil3132 cards that I got from Monoprice; they could be a bottleneck as well.

 


Link to comment

Just to confirm the sort of speeds others mentioned, I find I get 100+MBps average during parity create/check.

I am just getting ready to switch in a new 6TB disk to use as parity. When pre-clearing it I found that I was getting around a 15-20% increase in speed compared to what I had seen with my 3TB drives, probably because the recording density has increased. However, I do not expect the overall parity check speed to increase, as that is normally determined by the slowest drive.

Link to comment


Also, Simplefeatures has some incompatibilities with the official unRAID 5 releases, and it is no longer supported by the authors. Dynamix is the improved replacement for Simplefeatures.
Link to comment

Now back to the original question: how do I copy a drive without getting OOM errors? Could any particular hardware be contributing to this?

 

Chances are there are tons of files and you will need to do this in chunks. There's only so much low memory.

Start unRAID in safe mode without any plugins and addons.

 

Try a drive at a time or specific directories at a time.

 

Do a sync, then drop the cache, before and after each chunk of rsync.

 

sync

echo 3 > /proc/sys/vm/drop_caches
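
The chunked approach above can be sketched as a small script. This is only a sketch under stated assumptions: the helper names (`run`, `drop_caches`, `copy_in_chunks`), the `DRY_RUN` switch and the mount points in the usage comment are hypothetical. It copies one top-level directory at a time with the same rsync flags used in the first post, and does a sync plus cache drop before and after each chunk. Writing to /proc/sys/vm/drop_caches needs root.

```shell
# Sketch: copy one top-level directory at a time, flushing and dropping
# caches before and after each chunk. All helper names and the DRY_RUN
# switch are hypothetical, chosen for illustration only.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

drop_caches() {
    run sync
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "echo 3 > /proc/sys/vm/drop_caches"
    else
        echo 3 > /proc/sys/vm/drop_caches   # requires root
    fi
}

copy_in_chunks() {
    # $1 = source directory, $2 = destination directory
    # Note: files sitting directly in $1 (not in a subdirectory) are skipped.
    local src="$1" dest="$2" dir
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        drop_caches
        run rsync -avh "$dir" "$dest/$(basename "$dir")/"
        drop_caches
    done
}

# Example with hypothetical mount points; remove DRY_RUN=1 to really copy:
#   DRY_RUN=1 copy_in_chunks /mnt/disk1 /mnt/disk2
```

Running it with DRY_RUN=1 first shows exactly which rsync commands would be issued, which makes it easy to confirm the chunk boundaries before committing to a long copy.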

 

 

Link to comment
