hawihoney Posted July 10, 2013

I have some problems with this new release, sorry. Something I saw in the past has come back with 16c. It has to do with writing and allocating huge files. These files are 30GB/40GB/50GB in size, and space is pre-allocated for them by ImgBurn (a Windows application). With 16c the pre-allocation is interrupted and an "unknown network error" is shown on my Windows 8 machines. There's nothing in the syslog. I know this is very vague, and I don't know anything about the changed parameters in 16c, but it looks to me as if some timeouts fire earlier than with 16b.
Joe L. Posted July 10, 2013

This might be an issue... perhaps a side effect of 16c, or a side effect of the free space available on the user's server.

Quote: I do have some problems with this new release, sorry. ... it looks to me that some timeouts fire earlier than with 16b.
Joe L. Posted July 10, 2013

Quote: I do have some problems with this new release, sorry. ... it looks to me that some timeouts fire earlier than with 16b.

Can you post the output of:

df

and also describe which user share you are writing to and how that share is configured (which allocation method, and how you have the min free space set for that share; if using a "disk" share, allocation method is not applicable, so let's just see whether it has the space needed).

Also, please post the output of:

ulimit -a
WeeboTech Posted July 10, 2013

I used to have this kind of problem on very full file systems on slower 5400 RPM drives. When I looked at the machines, no writes were occurring on the Windows station, yet the output drive was scrambling, with no activity on the parity drive. I'm guessing it was a ReiserFS superblock/free-space search or something. Once space was found, either the writes would start or the timeout occurred on the Windows Samba client. After the timeout, I would attempt the copy again, and it would work right away the second time. While this doesn't pinpoint the problem, it may provide a test to see if it's the same one, i.e. an initial ReiserFS allocation timeout.
mejutty Posted July 10, 2013

I used to see this error quite a lot. If it's the same error I used to see, the following will be true: you don't have a cache drive, and the folder you are writing the files to contains more than 40 files. What I found is that the allocation of the file happens in the background; it just happens too slowly and does not allocate the whole space before Windows times out. http://lime-technology.com/forum/index.php?topic=20757.15

If you create a new empty folder and then try the copy again into that newly created folder (which contains no files), it should copy fine if it's the same error. I now have a cache drive and never see the problem, so I have no idea whether it got worse or better with any of the RC releases.
WeeboTech Posted July 10, 2013

Is this issue related to release 16c, or is it always present? Is there a way to reproduce it at will?
Frank1940 Posted July 10, 2013

Have you set up a 'Minimum Free Space' parameter as outlined in the following link? http://lime-technology.com/wiki/index.php/Un-Official_UnRAID_Manual#Min._Free_Space
boof Posted July 10, 2013

I had this sort of problem when trying to set up large TrueCrypt volumes on a disk via Samba (i.e. hundreds of gigabytes in size). That would have been way back in 4.x, though, so if this is a similar thing (and it sounds like it), it's probably been around for a while. I can't remember how I worked around it; I think I just kept trying until it eventually managed to do it. Not much use for diagnostics.
WeeboTech Posted July 10, 2013

Just to add details. In my experience (only mine), it ALWAYS happened on 5400 RPM hard drives that were nearly full with a huge number of files, no matter how much RAM I had in the system: 1GB, 2GB, 4GB or 8GB. The server always showed a huge amount of activity on the drive, like it was searching for something before the write would start. Even if I did a find down the whole drive before writing to it, the Samba write timeout still happened. I knew no data was moving across the network, because I was using TeraCopy.
hawihoney Posted July 10, 2013 Author

Thanks for your answers. All drives are pretty full, but I did not see these problems with 16b. The share in question spans 12 of the 14 data drives: allocation_method=highwater with min_free_space=0 and split_level=999. It's no show stopper for me, as I can produce the huge files on the local machines and copy them over to the tower -- this always works. Sometimes writing to an individual disk instead of the share helps, but not always. I don't have a cache drive. If it happens next time, I will try the empty-folder trick and report back.

root@Tower2:~# ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 31851
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 40960
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 31851
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

root@Tower2:~# df
Filesystem     1K-blocks        Used  Available Use% Mounted on
tmpfs             131072         188     130884   1% /var/log
/dev/sda          990960       40544     950416   5% /boot
/dev/md1      2930177100  2848898612   81278488  98% /mnt/disk1
/dev/md2      2930177100  2876879764   53297336  99% /mnt/disk2
/dev/md3      2930177100  2857527848   72649252  98% /mnt/disk3
/dev/md4      2930177100  2847528452   82648648  98% /mnt/disk4
/dev/md5      2930177100  2850985712   79191388  98% /mnt/disk5
/dev/md6      2930177100  2848203856   81973244  98% /mnt/disk6
/dev/md7      2930177100  2854532148   75644952  98% /mnt/disk7
/dev/md8      1465093832  1385516916   79576916  95% /mnt/disk8
/dev/md9      1465093832  1375018804   90075028  94% /mnt/disk9
/dev/md10     1465093832  1388977328   76116504  95% /mnt/disk10
/dev/md11     1465093832  1311724844  153368988  90% /mnt/disk11
/dev/md12     2930177100  2472392152  457784948  85% /mnt/disk12
/dev/md13     2930177100  2761381736  168795364  95% /mnt/disk13
/dev/md14     2930177100  2758653556  171523544  95% /mnt/disk14
shfs         35162146328 33438221864 1723924464  96% /mnt/user

Thanks
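To see at a glance which of the disks above could still hold one of these 40GB images, the df output can be filtered on the Available column. A minimal sketch (the /mnt/disk* mount points and the 40GB threshold are assumptions based on this thread):

```shell
# List mounted disks with at least 40GB free in df's Available column.
# /mnt/disk* matches unRAID's per-disk mount points (assumption).
need_kb=$((40 * 1024 * 1024))   # 40GB expressed in 1K blocks
df -P /mnt/disk* 2>/dev/null | awk -v need="$need_kb" \
    'NR > 1 && $4 >= need { print $6, $4 " KB free" }'
```

On the df output above, every data disk still clears 40GB of free space, which is why a single 40GB write can succeed even at 98% full.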
Frank1940 Posted July 10, 2013

Quote: All drives are pretty full, but I did not see these problems with 16b. The share in question spans 12 of the 14 data drives: allocation_method=highwater with min_free_space=0 and split_level=999.

Did you read that min_free_space SHOULD be set to approximately twice the size of the largest file to be written to the share? If that is not the source of your problem, it soon will be!
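As a worked example of that rule of thumb (assuming the Min. Free Space field takes a value in KB, and taking 50GB as the largest image written in this thread):

```shell
# Twice the largest expected file, converted to KB for the share setting.
largest_gb=50
min_free_kb=$((largest_gb * 2 * 1024 * 1024))
echo "$min_free_kb"   # 104857600 KB, i.e. 100GB
```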
limetech Posted July 10, 2013

Quote: Thanks for your answers. All drives are pretty full,

That is entirely the issue. You need a bigger server, man; I can recommend one..

Quote: but I did not see these problems with 16b. The share in question spans 12 of the 14 data drives: allocation_method=highwater with min_free_space=0

Coincidence. You were probably only slightly less than nearly full then.

Quote: and split_level=999.

Leaving the 'split level' field blank accomplishes the same thing as setting it to a very high number.
hawihoney Posted July 10, 2013 Author

Quote: That is entirely the issue. You need a bigger server, man; I can recommend one..

Does this mean I have 2TB free on a server and can't write 40GB to it? And that I have at least 80GB free on a single drive on a server and can't write 40GB to it? Time for a new filesystem instead of new hardware ;-)
limetech Posted July 10, 2013

Quote: Does this mean I have 2TB free on a server and can't write 40GB to it? And that I have at least 80GB free on a single drive on a server and can't write 40GB to it?

It means you are nearly 98% full. You might squeeze a little more onto disk11 and disk12.

Quote: Time for a new filesystem instead of new hardware ;-)

Yes, it turns out ReiserFS is notorious for slowing way down the closer it gets to 100% full, but you will find most other file systems do as well. Sorry, there's not a lot I can do about it right now.
mejutty Posted July 10, 2013

I haven't tested this particular issue for a while, since I installed a cache drive, but I saw it when my disks were only 50% full. It also still happened when the drive being written to was empty. You might disable the cache drive on the share and see what happens.
WeeboTech Posted July 10, 2013

Can you try making a new folder first, before you start the 40GB copy? I can say from experience that I have seen the same issues. If you have the time or ability, do the copy near your machine while it's idle and freshly booted. Start your copy and see if the first allocation of the file utilizes only the drive in question. It should be seeking furiously. Either the copy will time out, or you will see it go very slowly while utilizing the data and parity drives. In my case I was using disk shares, and I would see this behavior. I've never used the user shares, so I cannot comment there.
WeeboTech Posted July 10, 2013

Quote: I haven't tested this particular issue for a while, since I installed a cache drive, but I saw it when my disks were only 50% full. It also still happened when the drive being written to was empty.

To a 5400 RPM drive or a 7200 RPM drive? I noticed the behavior was more prominent on the slower drives near high capacity. I wonder if there's a Samba timeout value that can be adjusted?
hawihoney Posted July 11, 2013 Author

The empty-folder trick works for me, incredible. What I did: I tried to create a big file in one of my shared folders (287 files; the directory itself is a direct child of the shared folder, with no additional subdirectories below). This was aborted with an unknown network error, twice. I then created an empty folder 'Temp' in the root of this shared folder, and I could create the big file successfully. I confirmed that twice today.

P.S.: Most of my drives are 7200 RPM.
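The workaround as described can be sketched as a pair of steps. The paths are stand-ins (a temp directory substitutes for the real share, so the sketch runs anywhere):

```shell
share=$(mktemp -d)                  # stands in for the shared folder's root
img=$(mktemp)                       # stands in for the 40GB ImgBurn image
mkdir -p "$share/Temp"              # 1. create a fresh, empty folder first
cp "$img" "$share/Temp/image.iso"   # 2. write the large file into it
ls "$share/Temp"                    # the write lands without the timeout
```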
garycase Posted July 11, 2013

This is an interesting issue ... but I don't think it's strictly due to copying large files. Just to confirm, I just copied two very large image files to my RC16c server ... one was 99GB, the other 65GB ... and they're both nested several directories into the share [//Tower/Backups/Images of my systems/PCName/Date of Image/<file goes here>]. Worked perfectly ... and I just ran a validation to confirm they were indeed good copies. I note that you can do the same -- you indicated that if you created the large files on another PC, they would copy fine. So ... the question is just what ImgBurn is doing that doesn't "play right" with unRAID. Just for grins, when I get a chance I'll create an image file directly on the unRAID server (using Image for Windows) and see if it also works okay.

However ... I think it's very likely that your issue is simply that your server is so full that the initial file write by ImgBurn lands okay on one drive, but when it then attempts to expand that file, the drive is too full for that. I don't believe unRAID will span files across drives (Tom, WeeboTech, etc. can confirm that). That's why copying the large file by itself works ... because the size is "known" when the copy starts, so it's allocated on a drive that has enough room. A simple test of that possibility: choose a drive that has plenty of extra space, and then create an image to a folder on that drive => nest the folder as deep as you typically would, as I don't think this has anything to do with the depth of the folder.

As for speed ... writes to very full drives definitely slow WAY down => in my experience it really doesn't matter whether they're 5400, 5900, or 7200 RPM drives (although obviously the faster drive, which also has notably faster seek times, will finish the "where to write" process quicker).
hawihoney Posted July 11, 2013 Author

Quote: I think it's very likely that your issue is simply that your server is so full that the initial file write by ImgBurn lands okay on one drive, but when it then attempts to expand that file, the drive is too full for that.

I don't think that's the reason. There's no drive expansion during the copy, because there's enough room on the drive that unRAID selected for that particular file. And I can recreate the issue even when using a single drive instead of a shared folder.

But I'm still impressed. This morning I had to create a lot of huge images on my two LimeTech towers, and when using an empty folder it always works. When using my usual folders it always fails (100% ok vs. 100% fail on huge files). What does ImgBurn do? There's an option to preallocate files, which is turned on by default. I will switch that option off and see what happens. Here's a quote from their manual:

Quote: Allocate Files On Creation: Files created in 'Read/Build Mode' will be preallocated. This cuts down on fragmentation.

I know this issue has to do with nearly full arrays, but I want to fill my array to its max and not waste 2TB per machine (=4TB) because of the nature of unRAID/ReiserFS.
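What "Allocate Files On Creation" amounts to can be sketched with Linux's fallocate, which reserves a file's full size before any data is written; on a nearly full ReiserFS disk, this up-front reservation is presumably the step that stalls until Samba times out. The 1MB size here is just for illustration:

```shell
# Reserve the file's full size up front (ImgBurn would reserve 40GB+).
fallocate -l 1M preallocated.img
stat -c '%s' preallocated.img      # size is already 1048576 bytes
```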
garycase Posted July 11, 2013

FWIW, I just finished doing an image directly to the server from Image for Windows on one of my other machines: a 54GB file ... sent directly to a well-nested folder, and it worked perfectly. However, none of my drives is close to full on my backup server (at least 1TB free on all of them).

My media server is a different story ... several of those drives are VERY full [in a couple of cases only 20MB or so of free space]. But I wrote directly to those drives to fill the last bit up ... and they're all static data (DVDs), so once they were full, they're never modified again. The last few writes to those drives were quite slow.

One other thought: have you had this problem with Windows 7?
hawihoney Posted July 11, 2013 Author

Quote: One other thought: have you had this problem with Windows 7?

Yes. This is not the first time my unRAID machine has become full, and this is something that bit me in the past as well. I'm a little bit further now: with the ImgBurn option "Settings/General/Page 1/AllocateFilesOnCreation=off", everything works. Do I need to worry about fragmentation now ;-)
garycase Posted July 11, 2013

Quote: With the ImgBurn option "Settings/General/Page 1/AllocateFilesOnCreation=off", everything works. Do I need to worry about fragmentation now ;-)

Good ... glad it's working. No, I wouldn't worry about fragmentation. My understanding is that Reiser does a much better job of minimizing it ... I simply don't think it's an issue with unRAID.
WeeboTech Posted July 11, 2013

Quote: I'm a little bit further now: with the ImgBurn option "Settings/General/Page 1/AllocateFilesOnCreation=off", everything works. Do I need to worry about fragmentation now ;-)

Only if you have multiple writes from multiple processes simultaneously.

What I've noticed in the past: drives that have not allocated any new files for a while (I'm not sure what the time frame is) will reveal this initial allocation delay if they are near full with a lot of files and are slower drives. Once you make the new 'initial' directory and/or file, all the necessary file system structures are in RAM, so any network access that allocates a new file can do so at a faster speed. This is why I asked if you could observe the lights on your drives. It's been a known issue that nearly full drives get slower at allocating new files and/or directories, at least for ReiserFS; I've not experienced it the same way with ext3 or the Veritas filesystem. With ReiserFS, once that new directory or file is opened and ready for writing, the next operations go faster. Over time the filesystem structures get flushed out (or you reboot), and the process occurs again.

This was a major reason for me to resort to a cache drive. I could write at optimal speeds, then move the files manually into place via rsync.
garycase Posted July 11, 2013

Quote: This was a major reason for me to resort to a cache drive. I could write at optimal speeds, then move the files manually into place via rsync.

Re: "move the files manually into place via rsync" ==> Is this the unRAID "mover" ... or do you do this independently? And if the latter, why?