Rsync issue [RESOLVED]



Hey there,

 

I am having a weird issue with an rsync of files from one Unraid box to another.

Below is the command I am running:

 

Quote

rsync -av --progress -e "ssh -o StrictHostKeyChecking=no" "$fromFolder" "$toFolder" --log-file="$toFolder/rsync-log.txt"

 

and the error I receive is:

 

Quote

rsync: connection unexpectedly closed (7188736497 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [receiver=3.1.3]
rsync: connection unexpectedly closed (99373 bytes received so far) [generator]
rsync error: unexplained error (code 255) at io.c(226) [generator=3.1.3]

 

It's weird though: re-running it makes it progress a little further. Sometimes I run it twice and it gets through another file; sometimes it takes 6 tries to get to the next file. The files range in size from 800 MB to 6 GB.

 

My google-fu has again let me down. Does anyone have any suggestions as to what could be causing this issue?

 

Thanks in advance!

 

edit: also attached a diagnostics report in case it helps

mediabox-diagnostics-20200131-2044.zip

Edited by phyzical

After trying various weird things to resolve this over the last 3-4 days, my current bodge is to brute force it :D while I continue trying to find a solution.

 

Quote

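#!/bin/bash

# Re-run the transfer until rsync exits successfully.
# RSYNCCOMMAND is a placeholder for the full rsync command from the first post.
while true
do
    if RSYNCCOMMAND; then
        echo "rsync completed normally"
        exit
    else
        echo "rsync failure. Retrying..."
        sleep 1
    fi
done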

 

Bit by bit it's getting through the files...
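A bounded variant of the retry loop above might look like the sketch below. It is only an illustration, not the script actually used in this thread: the `retry` function name, the `RETRY_DELAY` variable and the 60-second cap are assumptions made for the example. It stops after a fixed number of attempts and waits a little longer after each failure instead of hammering the link once per second.

```shell
#!/bin/bash

# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds, default 1) is doubled after each failure, capped at 60.
# (Hypothetical helper for illustration; not part of rsync.)
retry() {
    local max_attempts=$1
    shift
    local delay=${RETRY_DELAY:-1}
    local attempt=1

    while true; do
        if "$@"; then
            echo "completed normally on attempt $attempt"
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "failure on attempt $attempt. Retrying in ${delay}s..." >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
        if [ "$delay" -gt 60 ]; then
            delay=60
        fi
    done
}

# Example call (not executed here), reusing the variables from the first post:
# retry 20 rsync -av --partial --timeout=60 \
#     -e "ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=15" \
#     "$fromFolder" "$toFolder"
```

In the example call, `--partial` keeps a partially transferred file so the next attempt can reuse it, `--timeout` makes rsync give up on a stalled connection after that many seconds, and `ServerAliveInterval` asks ssh to probe the connection periodically. None of this fixes a dropping link; it only makes the workaround less noisy.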

 


I thought I replied to you but maybe forgot to post. What you are seeing is very typical of network issues, e.g. random connection drops. You should check the connections on both ends (or at least check the syslog to see if your connections are dropping). I have had an overheating NIC randomly drop off and come back on in the past.

Brute force is ok but if you don't resolve the connection issues, you may end up not being able to trust the content of the copy.
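On the question of trusting the copy: by default rsync decides whether a file needs transferring from its size and modification time, so one way to gain confidence afterwards is to compare checksums independently. The sketch below assumes GNU coreutils (`sha256sum`, `realpath`, `sort -z`) are available on both boxes; the function names `make_manifest` and `check_manifest` are made up for this example. The idea is to build a checksum list on the source, copy that one small file across, and check it on the destination.

```shell
#!/bin/bash

# make_manifest DIR
# Prints a SHA-256 checksum line for every regular file under DIR,
# with paths relative to DIR so the list can be checked on another machine.
make_manifest() {
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

# check_manifest DIR MANIFEST_FILE
# Recomputes checksums under DIR and compares them with MANIFEST_FILE.
# Prints only the files that fail and returns non-zero if any file
# is missing or differs.
check_manifest() {
    local manifest
    manifest=$(realpath "$2")   # resolve to an absolute path before changing directory
    (cd "$1" && sha256sum --quiet -c "$manifest")
}

# Example (not executed here; the manifest path is hypothetical):
#   on the source box:       make_manifest "$fromFolder" > /tmp/manifest.sha256
#   on the destination box:  check_manifest "$toFolder" /tmp/manifest.sha256
```

Reading every byte of a large library takes a long time, so this is something to run once the transfer has finished. rsync's own `--checksum` option combined with `--dry-run` and `--itemize-changes` is an alternative that does the comparison over the network connection.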


hey again dasi,

 

thanks for the reply.

 

In regards to just brute forcing it, I tried playing back a couple of the files that took 3-4 --partial rsyncs to make it across and they seem okay. Shouldn't rsync see that the files differ anyway and just recopy them, or is there still a risk that it thinks a file 100% made it across when it didn't?

 

Yeah, I had a feeling it was network related. It is about 40 TB of content. The first time I rsynced it, from a -> b back when a was just a Windows box, it went fine and never hit this issue. Now I am going b -> a, and this issue didn't start occurring until about 25% of the way through.

 

I did check a few cables and reset one of the switches, thinking it might have been the cause. I will try power cycling all the network gear and giving all the cables a check.

 

If it was an overheating NIC, would the syslog reveal this? If not, how would I go about diagnosing it?

 

Thanks again!

Edited by phyzical

So I tried checking all the cables and power cycled my router and the two switches between the two PCs.

 

Below is the syslog summary. I do see mentions of the NIC, but I am not familiar enough with it to tell if anything is wrong.

 

Quote

Jan 31 23:18:46 Mediabox kernel: e1000e: eth0 NIC Link is Down
Jan 31 23:18:46 Mediabox kernel: bond0: link status definitely down for interface eth0, disabling it
Jan 31 23:18:46 Mediabox kernel: device eth0 left promiscuous mode
Jan 31 23:18:46 Mediabox kernel: bond0: now running without any active interface!
Jan 31 23:18:46 Mediabox kernel: br0: port 1(bond0) entered disabled state
Jan 31 23:19:05 Mediabox kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 31 23:19:05 Mediabox kernel: bond0: link status definitely up for interface eth0, 1000 Mbps full duplex
Jan 31 23:19:05 Mediabox kernel: bond0: making interface eth0 the new active one
Jan 31 23:19:05 Mediabox kernel: device eth0 entered promiscuous mode
Jan 31 23:19:05 Mediabox kernel: bond0: first active interface up!
Jan 31 23:19:05 Mediabox kernel: br0: port 1(bond0) entered blocking state
Jan 31 23:19:05 Mediabox kernel: br0: port 1(bond0) entered forwarding state
Jan 31 23:19:09 Mediabox kernel: e1000e: eth0 NIC Link is Down
Jan 31 23:19:10 Mediabox kernel: bond0: link status definitely down for interface eth0, disabling it
Jan 31 23:19:10 Mediabox kernel: device eth0 left promiscuous mode
Jan 31 23:19:10 Mediabox kernel: bond0: now running without any active interface!
Jan 31 23:19:10 Mediabox kernel: br0: port 1(bond0) entered disabled state
Jan 31 23:19:13 Mediabox kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 31 23:19:13 Mediabox kernel: bond0: link status definitely up for interface eth0, 1000 Mbps full duplex
Jan 31 23:19:13 Mediabox kernel: bond0: making interface eth0 the new active one
Jan 31 23:19:13 Mediabox kernel: device eth0 entered promiscuous mode
Jan 31 23:19:13 Mediabox kernel: bond0: first active interface up!
Jan 31 23:19:13 Mediabox kernel: br0: port 1(bond0) entered blocking state
Jan 31 23:19:13 Mediabox kernel: br0: port 1(bond0) entered forwarding state

 


Definitely a NIC issue. Within the space of 1 minute, your NIC went down twice (and likely multiple times every time you run the rsync script).

 

You can try restarting the server. That can help in some cases.

Failing that, you need a new NIC (but hopefully a restart will be fine).

 

Jan 31 23:18:46 Mediabox kernel: e1000e: eth0 NIC Link is Down
...
Jan 31 23:19:05 Mediabox kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
...
Jan 31 23:19:09 Mediabox kernel: e1000e: eth0 NIC Link is Down
...
Jan 31 23:19:13 Mediabox kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
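To see how often this happens over a longer period, the link-down messages can be counted rather than read by eye. The sketch below is only an illustration: the function name `count_link_downs` is made up, the pattern matches the e1000e message format quoted above (other drivers word this message differently), and the log path in the usage comment is an assumption about where the syslog lives.

```shell
#!/bin/bash

# count_link_downs LOGFILE [IFACE]
# Prints how many "NIC Link is Down" kernel messages LOGFILE contains
# for IFACE (default eth0).
count_link_downs() {
    local logfile=$1
    local iface=${2:-eth0}
    grep -c "kernel: .*${iface} NIC Link is Down" "$logfile"
}

# Example (not executed here; assumes the syslog is at /var/log/syslog):
# count_link_downs /var/log/syslog eth0
```

Running it before and after an rsync attempt shows whether the failures line up with link drops.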

 

 


aw fack,

 

It has already been restarted multiple times since this issue started occurring, so I think it may be busted.

 

Checking the syslog again, it has only done the double up-and-down once in the last hour.

 

But I do have a spare PCI NIC, so I will chuck that in tomorrow morning and see if the problem goes away.

 

fingers crossed!

 

Thanks again!


hey @testdasi,

 

So I tried the PCI NIC and the rsync kept hitting the same issue, so I tried disabling the onboard NIC and the issue still occurred.

 

So I tried a different LAN cable from the destination PC to the first switch and another LAN cable from the host PC to the other switch; still no change. Do you know of a way I can test the NIC outside of the rsync command so I can try to rule out the data?

 

So if it's not the NIC, do you happen to have any more suggestions as to what could be causing this issue?

 

Thanks!
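One possible way to exercise the link without rsync or the disks is to push dummy data over ssh and watch the kernel's own count of link transitions. The sketch below assumes a Linux kernel that exposes `/sys/class/net/<iface>/carrier_changes`; the function names, the `SYSFS_NET` override and the `user@otherbox` host in the usage comment are made up for illustration.

```shell
#!/bin/bash

# Location of the per-interface sysfs directories.
# Overridable so the functions can be tried against a fake directory.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# carrier_changes IFACE
# Prints the kernel's running count of link up/down transitions for IFACE.
carrier_changes() {
    cat "$SYSFS_NET/$1/carrier_changes"
}

# flaps_during IFACE COMMAND [ARGS...]
# Runs COMMAND and prints how many carrier changes happened while it ran.
# COMMAND should not write to stdout, or its output will be mixed with the count.
flaps_during() {
    local iface=$1
    shift
    local before after
    before=$(carrier_changes "$iface") || return 1
    "$@"
    after=$(carrier_changes "$iface") || return 1
    echo $((after - before))
}

# Example (not executed here): send about 10 GB of zeros to the other box,
# bypassing the disks on both sides, and report link transitions on eth0.
# flaps_during eth0 sh -c \
#     'dd if=/dev/zero bs=1M count=10000 | ssh user@otherbox "cat > /dev/null"'
```

If the count is non-zero with dummy data, the files themselves are unlikely to be the cause. If the link stays up but the ssh session still dies, the problem is more likely somewhere other than the physical link.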


So, a quick update.

I decided to try mounting the SMB share of the files I'm trying to rsync, and the issue seems to have stopped occurring, although every now and then I notice:

 

Quote

WARNING FILE.ext failed verification -- update discarded (will try again)

 

So I guess whatever it is is still occurring, but nowhere near as much as it was with rsync via ssh.


Haha, so it looks like I spoke too soon. Although this time it may be unrelated, I thought I would mention it just in case.

 

My script threw "Transport endpoint is not connected", and I was unable to ls the /mnt/user folder, which returned the same error. Googling suggested checking the filesystem, so I ran xfs_repair on all the disk mounts. Nothing needed fixing.

 

Rebooted and it was all back, and the script rsyncing with the mount method is continuing on.
