Mover crashing server

mexicanmike · September 4, 2012

OOOh man, I am having this exact same problem and have tried almost everything.

I have eliminated the disk as the issue. Countless replacement disks have the same issue. It is also not the SATA port or SAS card, as I have also moved positions on the board, into different tray locations.

I am running completely stock, and will be attaching system logs on my next post.

Rich,

Did changing min. free space have any impact on this issue? I am about to try changing this to a very large size to see if it helps, but I honestly think I tried this already.

Read this:

http://lime-technology.com/wiki/index.php/Un-Official_UnRAID_Manual#Min._Free_Space

I have read that. In addition, I have set all my shares to have a min. free space of 60GB (60000000). I was able to use the cache drive yesterday for a move of around 100GB to one of the shares. Right after, I attempted to use the cache drive for about 250GB to a different share, and it crashed. Reboot, invoke mover, crashed again.

I really don't think that it is the share; I have rebuilt these shares from scratch. I was on B14, copied all the data off, recreated the shares, and moved to 5.0rc-2, and this same thing is happening.

Tried to get logs, but it keeps dumping them. Any suggestions?

Frank1940 · September 4, 2012

Your moves involve massive amounts of data, much more than a single file move would normally represent. (Many users use unRAID for media storage, and 50GB is about as large a file as a Blu-ray ISO will generate.)

You should now look very carefully at this sentence in the "Min. Free Space" section that I referred you to:

"Note that unRAID will still place files on the disk if the split level does not allow the files to be placed on another disk with more free space. "

The next section in the WIKI has to do with the "Split level". You should read it. Many users find that split level is a difficult concept to get their heads around. Min Free Space and Split Level work together to determine where files have to be placed in the array.

I suspect that your split level setting is forcing unRAID to try to store all of the files in the 'move' on a single disk, and there is not enough room for them to fit. In that case the move will fail!

dgaschk · September 4, 2012

None of these issues are causing the server to crash. See here: http://lime-technology.com/forum/index.php?topic=9880.0

Use the telnet-tail method to collect the syslog.

ThOr101 · January 21, 2013

I had this same problem, and decided to dive as deep as I possibly could into it, and I found a "solution".

Edit your /usr/local/sbin/mover script

Change this line:

-exec rsync -i -dIWRpEAXogt --numeric-ids --inplace {} /mnt/user0/ \; -delete

to this line:

-exec rsync -i -dIWRpEAogt --numeric-ids --inplace {} /mnt/user0/ \; -delete

When rsync tries to set the extended attributes of the file on the /mnt/user0 file system, it core dumps / kernel panics. I'm not exactly sure which extended attributes need to be carried over when the cached file is moved. I don't think I'm using any of them, but I guess time will tell.

One other thought while looking at the mover script. If you have been having these problems, you have probably lost a lot of files. The mover script will delete the file without a successful copy:

find "./$Share" -depth \( -type f ! -exec fuser -s {} \; -o -type d -empty \) -print \
-exec rsync -i -dIWRpEAXogt --numeric-ids --inplace {} /mnt/user0/ \; -delete

I would probably suggest to the owner of this script to use --remove-source-files and allow rsync to ensure the file has been properly moved, and let it remove the original. This just deletes files when there is a failure.
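As a rough illustration of that suggestion, here is a minimal sketch of rsync's --remove-source-files option run in a throwaway directory. The directory names (cache, array, Movies) and the file name are made up for the demo and merely stand in for the cache disk and /mnt/user0. This is not the actual mover script, and it says nothing about what happens when the kernel itself panics mid-transfer.

```shell
#!/bin/sh
# Sketch only: paths below are hypothetical stand-ins, not unRAID's real layout.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed; skipping demo"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/cache/Movies" "$work/array"
echo "payload" > "$work/cache/Movies/film.iso"

cd "$work/cache" || exit 1

# -a  preserves permissions, times and (when run as root) ownership
# -R  keeps the relative path (Movies/film.iso) under the destination
# --remove-source-files  rsync removes each source file itself, and only
#     after that file has been transferred successfully
rsync -aR --remove-source-files "Movies/film.iso" "$work/array/"
rc=$?

echo "rsync exit status: $rc"
ls -R "$work"
```

With this option rsync removes only files, not directories, so a real script would still need a separate step to clean up empty directories on the source side.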

I hope this helps.

--THOR!

limetech · January 21, 2013

No, do not make these changes. You want to preserve extended attributes for a number of reasons:

a) for AFP, netatalk3 will store a CNID in an extended attribute

b) the DOS hidden/system/archive bits can be stored there (not by default, but this can be turned on by adding a "store dos attributes = yes" line to config/smb-extra.conf).

c) Active Directory absolutely requires extended attributes

d) some applications may use extended attributes.

Granted they can be turned off without much grief for most users, but to avoid future problems they should not be. Preservation of extended attributes is precisely why 'rsync' is used instead of a simple 'mv' command.

Please provide a link to the thread where the kernel panic bug was posted.

Secondly, the '-delete' option will not get executed if the preceding 'rsync' is not successful.
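The rule behind that statement can be checked in isolation. The sketch below, using made-up file names in a temporary directory, shows that find evaluates -delete only when the preceding -exec command exits with status 0. It covers the ordinary case of a command returning a non-zero exit status and does not try to reproduce a core dump or kernel panic. Note that -delete is a GNU/BSD find extension rather than part of POSIX.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical; nothing here touches real shares.
work=$(mktemp -d)
touch "$work/kept.txt" "$work/removed.txt"

# 'false' always exits non-zero, so -exec evaluates to false and find
# never reaches -delete for this file.
find "$work" -type f -name 'kept.txt' -exec false \; -delete

# 'true' always exits zero, so -exec evaluates to true and -delete runs.
find "$work" -type f -name 'removed.txt' -exec true \; -delete

# Only kept.txt should remain.
ls "$work"
```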

ThOr101 · January 25, 2013

So, I just tested it, and the -delete does indeed delete the file when rsync core dumps. Or something is deleting the file.

I appreciate your offer to send you the trace / dump when rsync fails. But... are you going to tell me to uninstall all of my add-ons and other seemingly useless things that don't really have an effect on some funky setting on the shared file system?

I'm happy to upload it from my system as is. If you see a problem that points to one of the add-ons, I guess I can go for it. But I don't have the time to deconstruct, and reconstruct my system just to make it core dump again.

Still want the stack trace / core dump with the kernel panic? There it is as an attachment.

Thanks for looking into this. Best of luck trying to figure out what happened.

Some history of my system: at one point one of the drives did fill up. I moved data around, but I would imagine that the mover ran smack dab into this problem at some point.

I'm not sure what could have gotten out of whack. I ran reiserfsck on all the drives, including the cache. No issues found.

kernelpanic.txt
