Roscoe62 Posted January 1, 2022

Currently running UnRaid OS v6.3.5. I just tried transferring a movie file across to the UnRaid server. It works fine for a time, transferring at approx. 113 MB/s, then just after around 10% of a 25 GB file, the transfer speed drops to 2 MB/s (according to Explorer). After about a minute or so, the speed ramps back up to what it was before, and then it repeats: it slows right down, almost like it's paused, before ramping up again. I've had the file transfer time out several times. I've NEVER experienced this before. I don't have a cache drive. I also used TeraCopy to do the file transfers, but, for whatever reason, I seem to experience more time-outs using that tool. Do I have an issue on my UnRaid server, is there a disk issue, or should I investigate further? If so, what should I do?
Squid Posted January 1, 2022

Is this transferring to the array or to the cache? The way the system works (and it's most noticeable when going to the array) is that transfers get cached in RAM and slowly written to the drive. Eventually RAM fills up and the system has to wait while it clears some out by actually writing it to the drive. Going to the array, on average you're looking at roughly 60 MB/s sustained. The "pauses" aren't particularly important, because what you're basically looking at is the average speed for the entire transfer (or the total time it takes).

You can speed this up by enabling reconstruct write in Settings - Disk Settings, which speeds up writes at the expense of power (all drives have to be spinning).

Directly to a cache drive (particularly an SSD), these dips are massively diminished due to its faster writes.
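The RAM write caching described above is the Linux kernel's "dirty page" cache: writes pile up in memory until configurable thresholds force a flush to disk, which is when the visible transfer speed dips. As a hedged illustration (these are standard Linux sysctls, not Unraid-specific settings), you can inspect the thresholds from a console session:

```shell
# Writes buffer in RAM until these thresholds are hit; then the kernel
# starts flushing to disk, stalling new writes if the disk falls behind.
cat /proc/sys/vm/dirty_background_ratio  # % of RAM before background flushing starts
cat /proc/sys/vm/dirty_ratio             # % of RAM before writers are blocked
```

Common kernel defaults are around 10 and 20 percent respectively, so a box with a lot of RAM can absorb several GB at full line speed before the first dip appears.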
Roscoe62 Posted January 1, 2022

I don't have a cache drive, but maybe I should look at one. Are you saying that the transfers slow down (when NOT writing to cache) when the server is running out of memory?

Secondly, if I have a cache drive and I want to copy a file to a certain location, does the file (transferred from, say, my Windows PC) go to the cache drive first and then get copied to the desired directory on the array? If so, when is it transferred? How long after the transfer to the cache drive is finished? If it's easier, is there somewhere this is documented on the Lime Technology website?

Many thanks for your help and direction.
Squid Posted January 1, 2022

No. What I'm saying is that writes to the array average around 50-60 MB/s (YMMV), but the OS caches transfers in RAM (hence the 113 MB/s, which is the line speed of a 1G network). Eventually the RAM cache gets filled and the system has to slow the transfer down because it needs to flush to the hard drive. This results in the dips you're seeing: it can't sustain a 113 MB/s transfer rate because, no matter how you cut it, the array can't keep up. You're not actually running out of memory. If the system didn't cache the writes at all, you'd see a solid ~60 MB/s transfer rate. Since it caches them, by the end of the transfer the average rate over the whole time frame is still going to be ~60 MB/s, but with peaks and dips reflecting the caching.

10 hours ago, Roscoe62 said: "If this is so, when is it transferred? How long after the transfer to the cache drive is finished?"

At the schedule you set. Usually during off hours (the default is 4 am daily).
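The averaging described above is easy to sanity-check. A minimal sketch (the 25 GB file size and the 113/60 MB/s figures are the ones from this thread):

```shell
# Expected wall-clock time for the 25 GB file at gigabit line speed
# (what the RAM cache lets you burst at) vs. sustained array speed.
size_mb=$((25 * 1024))
awk -v s="$size_mb" 'BEGIN {
    printf "burst at 113 MB/s:    %.0f s\n", s / 113
    printf "sustained at 60 MB/s: %.0f s\n", s / 60
}'
```

The copy dialog reports the burst figure early on, but the whole transfer can't finish faster than the sustained figure, hence the long near-paused stretches while the cache drains.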
Hoopster Posted January 1, 2022

12 hours ago, Roscoe62 said: "Secondly, if I have a cache drive, if I want to copy a file to a certain location, when the file is transferred"

As @Squid said, the Mover (the process that moves files from cache to the array) runs on a schedule set by you. However, you do not determine the exact location on the array for the files moved by the Mover. The use of the cache drive for caching writes to the array is enabled at the unRAID share level. If your shares happen to span multiple disks, the files are written to the disk currently being used by the share according to your allocation method and minimum free space settings.
Roscoe62 Posted January 3, 2022

Thanks! I probably didn't use the correct terminology. When doing a transfer to the array, I need the file to go to a directory I've set up on the share (which, most likely, DOES span multiple physical disks), but I don't really care which physical disk the file ends up on; I'm happy for UnRaid to manage that.

Along with the Mover moving files at the time I set, is it possible to have it move a file as soon as the transfer to the cache drive has completed? Or, alternatively, is it possible to override the default move time? The reason I'm asking is that I'll often pull in TV episodes, transfer them to the array, and intend to watch them within, say, the next hour or so, so having the Mover move the files at a low-traffic time doesn't really suit my use case.
itimpi Posted January 3, 2022

As far as using files is concerned, it is transparent whether files for a User Share are on the main array or on a cache pool, as long as you access them via the User Share system. Unraid User Shares provide a consolidated view of the files in a share regardless of whether they currently reside on the main array or on a cache pool associated with the share. For the use case you mention, you do not need to know where the files currently are.

You can always run mover manually, but that does not seem to be needed for the use case you mention. You can also bypass the User Share system when placing files by transferring them directly to a specific drive; they will still appear under the appropriate User Share as long as you put them into an appropriate directory.
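For completeness, "running mover manually" can be done from the Main page of the web GUI, or from a console/SSH session. A hedged sketch (the path below is where stock Unraid installs the mover script; verify it on your version, and note the guard so it degrades gracefully on a non-Unraid box):

```shell
# Kick off the mover by hand instead of waiting for the schedule.
if [ -x /usr/local/sbin/mover ]; then
    /usr/local/sbin/mover
else
    echo "mover script not found (not an Unraid system?)"
fi
```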
Roscoe62 Posted January 3, 2022

Thanks for the response. I didn't know that, for accessing files, it's transparent to the user where they currently are (on the cache drive or on the array). That's a really nice system feature! I can now see a LOT of value in adding an SSD as a cache drive. Thanks for educating me, much appreciated!
Roscoe62 Posted January 3, 2022

Hmmm... perhaps I understated the original issue I'm experiencing. I went back and reviewed the wording of my original post. What I DIDN'T say was that the transfers are consistently timing out. I understand, from the responses given earlier, that the transfer speed will fluctuate, and that's fair enough, BUT the transfers are often FAILING, giving a message similar to the attached one. I followed your advice, went into the UnRaid Disk Settings and enabled 'reconstruct write'. However, even after doing that, the transfers are still failing / timing out. The transfer speed sometimes drops to zero, and not long after that I get the error message. In the past, file transfers, especially of large files, have been a little slow, but they've almost never failed before, and this is happening consistently now. I'm happy to investigate installing an SSD as a cache drive, but does this current behavior point to a different issue?
itimpi Posted January 3, 2022

No idea exactly what is causing your problem. The only time I got similar symptoms, it turned out in the end to be the network on the Windows client failing under heavy load.
Roscoe62 Posted January 4, 2022

Don't know if it's helpful/useful, but here's a snapshot of the 'stats' during the transfer...
Roscoe62 Posted January 4, 2022

...and I should have specified: the transfer failed again.
Roscoe62 Posted January 4, 2022

Yes, you're right. Diagnostics attached; I created the diagnostics file just after attempting (and failing) to transfer the file this morning.

server-diagnostics-20220105-0845.zip
Roscoe62 Posted January 5, 2022

I took a look at the syslog file under the logs directory. I found the first occurrence of something that looks like an issue matching mine (I really don't know much about this stuff, so I'm just guessing). The first incident happened yesterday, and the times the same or similar incident recurs likely line up with the times I attempted (and failed) to transfer the file. The last time was 8:45 AM this morning. Here are the 'selected highlights'. To me (again, I'm not an expert, so I could still be miles off track) it looks like some kind of memory issue, but it would be GREAT to get some expert oversight to confirm or deny, and to (hopefully) advise next steps.

FIRST TIME OBSERVED ISSUE:

Jan 4 09:38:47 Server kernel: cpuload invoked oom-killer: gfp_mask=0x27080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK), nodemask=0, order=0, oom_score_adj=0
Jan 4 09:38:47 Server kernel: cpuload cpuset=/ mems_allowed=0
Jan 4 09:38:47 Server kernel: CPU: 0 PID: 10375 Comm: cpuload Not tainted 4.9.30-unRAID #1
Jan 4 09:38:47 Server kernel: Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0b 05/02/2017
Jan 4 09:38:47 Server kernel: ffffc9002039fb80 ffffffff813a4a1b ffffc9002039fd68 ffffffff8193d255
Jan 4 09:38:47 Server kernel: ffffc9002039fc00 ffffffff8111ef44 0000000000000000 0000000000000000
Jan 4 09:38:47 Server kernel: ffffffff810b2131 0000000000000000 ffffc9002039fbd8 ffffffff8105407d
Jan 4 09:38:47 Server kernel: Call Trace:
Jan 4 09:38:47 Server kernel: [<ffffffff813a4a1b>] dump_stack+0x61/0x7e
Jan 4 09:38:47 Server kernel: [<ffffffff8111ef44>] dump_header+0x76/0x20e
Jan 4 09:38:47 Server kernel: [<ffffffff810b2131>] ? delayacct_end+0x51/0x5a
Jan 4 09:38:47 Server kernel: [<ffffffff8105407d>] ? has_ns_capability_noaudit+0x34/0x3e
Jan 4 09:38:47 Server kernel: [<ffffffff810c7ce6>] oom_kill_process+0x81/0x377
Jan 4 09:38:47 Server kernel: [<ffffffff810c84af>] out_of_memory+0x3aa/0x3e5
Jan 4 09:38:47 Server kernel: [<ffffffff810cc181>] __alloc_pages_nodemask+0xb5b/0xc71
Jan 4 09:38:47 Server kernel: [<ffffffff81102d82>] alloc_pages_current+0xbe/0xe8
Jan 4 09:38:47 Server kernel: [<ffffffff81046421>] pte_alloc_one+0x12/0x35
Jan 4 09:38:47 Server kernel: [<ffffffff810ee00a>] handle_mm_fault+0x766/0xf96
Jan 4 09:38:47 Server kernel: [<ffffffff81042252>] __do_page_fault+0x24a/0x3ed
Jan 4 09:38:47 Server kernel: [<ffffffff81042438>] do_page_fault+0x22/0x27
Jan 4 09:38:47 Server kernel: [<ffffffff81680f18>] page_fault+0x28/0x30
Jan 4 09:38:47 Server kernel: Mem-Info:
Jan 4 09:38:47 Server kernel: active_anon:175370 inactive_anon:4469 isolated_anon:0
Jan 4 09:38:47 Server kernel: active_file:3368827 inactive_file:332778 isolated_file:160
Jan 4 09:38:47 Server kernel: unevictable:0 dirty:331628 writeback:1282 unstable:0
Jan 4 09:38:47 Server kernel: slab_reclaimable:111294 slab_unreclaimable:13733
Jan 4 09:38:47 Server kernel: mapped:17645 shmem:162919 pagetables:2374 bounce:0
Jan 4 09:38:47 Server kernel: free:51705 free_pcp:398 free_cma:0
Jan 4 09:38:47 Server kernel: Node 0 active_anon:701480kB inactive_anon:17876kB active_file:13475308kB inactive_file:1331112kB unevictable:0kB isolated(anon):0kB isolated(file):640kB mapped:70580kB dirty:1326512kB writeback:5128kB shmem:651676kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 6144kB writeback_tmp:0kB unstable:0kB pages_scanned:22529989 all_unreclaimable? yes
Jan 4 09:38:47 Server kernel: Node 0 DMA free:15884kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15968kB managed:15884kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Jan 4 17:48:52 Server kernel: Out of memory: Kill process 17308 (smbd) score 1 or sacrifice child
Jan 4 17:48:52 Server kernel: Killed process 17308 (smbd) total-vm:402732kB, anon-rss:11096kB, file-rss:4kB, shmem-rss:15988kB
Jan 4 17:48:52 Server kernel: oom_reaper: reaped process 17308 (smbd), now anon-rss:0kB, file-rss:0kB, shmem-rss:3324kB
Jan 4 17:54:22 Server kernel: cpuload invoked oom-killer: gfp_mask=0x27002c2(GFP_KERNEL_ACCOUNT|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NOTRACK), nodemask=0, order=0, oom_score_adj=0
Jan 4 17:54:22 Server kernel: Out of memory: Kill process 20368 (smbd) score 1 or sacrifice child
Jan 4 17:54:22 Server kernel: Killed process 20368 (smbd) total-vm:400552kB, anon-rss:11024kB, file-rss:4kB, shmem-rss:15920kB
Jan 4 17:54:22 Server kernel: oom_reaper: reaped process 20368 (smbd), now anon-rss:0kB, file-rss:0kB, shmem-rss:3328kB
Jan 4 17:57:13 Server kernel: awk invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
Jan 4 17:57:13 Server kernel: awk cpuset=/ mems_allowed=0
Jan 4 17:57:13 Server kernel: CPU: 5 PID: 24524 Comm: awk Not tainted 4.9.30-unRAID #1
Jan 4 17:57:13 Server kernel: Out of memory: Kill process 23099 (smbd) score 1 or sacrifice child
Jan 4 17:57:13 Server kernel: Killed process 23099 (smbd) total-vm:400552kB, anon-rss:10868kB, file-rss:4kB, shmem-rss:15916kB
Jan 4 17:57:13 Server kernel: oom_reaper: reaped process 23099 (smbd), now anon-rss:32kB, file-rss:0kB, shmem-rss:3324kB
Jan 4 18:01:01 Server kernel: notify invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
Jan 4 18:01:01 Server kernel: notify cpuset=/ mems_allowed=0
Jan 4 18:01:01 Server kernel: CPU: 3 PID: 26616 Comm: notify Not tainted 4.9.30-unRAID #1
Jan 4 18:01:01 Server kernel: Out of memory: Kill process 24534 (smbd) score 1 or sacrifice child
Jan 4 18:01:01 Server kernel: Killed process 24534 (smbd) total-vm:423376kB, anon-rss:11248kB, file-rss:4kB, shmem-rss:15980kB
Jan 4 18:01:01 Server kernel: oom_reaper: reaped process 24534 (smbd), now anon-rss:0kB, file-rss:0kB, shmem-rss:3324kB
Jan 4 20:58:20 Server kernel: mdcmd (677): spindown 2
Jan 4 20:58:21 Server kernel: mdcmd (678): spindown 8
Jan 4 20:58:45 Server kernel: mdcmd (679): spindown 1
Jan 4 20:58:47 Server kernel: mdcmd (680): spindown 7
Jan 4 21:01:59 Server kernel: mdcmd (681): spindown 0
Jan 4 21:03:09 Server kernel: mdcmd (682): spindown 3
Jan 4 21:03:10 Server kernel: mdcmd (683): spindown 5
Jan 4 21:29:09 Server kernel: mdcmd (684): spindown 4
Jan 5 00:32:14 Server kernel: mdcmd (685): spindown 6
Jan 5 08:42:32 Server emhttp: Spinning up all drives...
Jan 5 08:42:32 Server kernel: mdcmd (686): spinup 0
Jan 5 08:42:32 Server kernel: mdcmd (687): spinup 1
Jan 5 08:42:32 Server kernel: mdcmd (688): spinup 2
Jan 5 08:42:32 Server kernel: mdcmd (689): spinup 3
Jan 5 08:42:32 Server kernel: mdcmd (690): spinup 4
Jan 5 08:42:32 Server kernel: mdcmd (691): spinup 5
Jan 5 08:42:32 Server kernel: mdcmd (692): spinup 6
Jan 5 08:42:32 Server kernel: mdcmd (693): spinup 7
Jan 5 08:42:32 Server kernel: mdcmd (694): spinup 8
Jan 5 08:45:27 Server kernel: cpuload invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), nodemask=0, order=0, oom_score_adj=0
Jan 5 08:45:27 Server kernel: cpuload cpuset=/ mems_allowed=0
Jan 5 08:45:27 Server kernel: CPU: 7 PID: 28770 Comm: cpuload Not tainted 4.9.30-unRAID #1
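For anyone else triaging a log like this: the lines that matter are the "Out of memory: Kill process" entries, which name each victim. A small sketch that pulls them out of a syslog excerpt (run here against a short embedded sample standing in for the real /var/log/syslog):

```shell
# Extract the OOM killer's victims (PID and command name) from syslog.
cat > /tmp/syslog.sample <<'EOF'
Jan  4 17:48:52 Server kernel: Out of memory: Kill process 17308 (smbd) score 1 or sacrifice child
Jan  4 17:54:22 Server kernel: Out of memory: Kill process 20368 (smbd) score 1 or sacrifice child
Jan  4 20:58:20 Server kernel: mdcmd (677): spindown 2
EOF
grep -o 'Kill process [0-9]* ([a-z]*)' /tmp/syslog.sample
```

In this thread the repeated victim is smbd, Samba's daemon, which would explain the symptom exactly: each time it is killed, the in-flight network copy from Windows dies with a timeout.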
trurl Posted January 5, 2022

17 hours ago, Roscoe62 said: "Jan 4 17:48:52 Server kernel: Out of memory: Kill process 17308 (smbd) score 1 or sacrifice child"

This indicates smbd was OOM-killed (the process was killed because the system ran out of memory). Do you have any VMs? How many dockers?
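To follow this line of questioning yourself, you can check at a glance how much memory is available and which processes hold the most of it. A hedged sketch using standard Linux tooling (nothing Unraid-specific):

```shell
# Total vs. currently available RAM, straight from the kernel...
grep -E '^(MemTotal|MemAvailable):' /proc/meminfo
# ...and the five largest resident processes (RSS in KB, biggest first).
ps -eo rss,comm --sort=-rss | head -n 6
```

If MemAvailable collapses toward zero during a transfer while no single process has grown, the pressure is coming from the page cache plus something unreclaimable, which matches the "all_unreclaimable? yes" line in the syslog dump above.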
Roscoe62 Posted January 6, 2022

No, no VMs, and only 2 dockers; neither is running, as I don't use them any more.
ChatNoir Posted January 6, 2022

On 1/1/2022 at 2:26 AM, Roscoe62 said: "Currently running UnRaid OS v 6.3.5."

Any reason for not updating to a newer version? It might be worth a try (back up your flash drive first).
Roscoe62 Posted January 6, 2022

No, there's no reason for not updating to a newer version, and I will give it a try. However, I'm a little concerned in case it's a hardware issue, in which case the upgrade won't help. Maybe before backing up my flash drive and updating, I'll reboot the server and run the memtest tool to see if there's an issue there. It certainly can't hurt...
ChatNoir Posted January 6, 2022

1 hour ago, Roscoe62 said: "Maybe before backing up my flash drive and updating, I'll reboot the server and use the memtest tool to see if there's an issue there. it certainly can't hurt...."

All good ideas.
Roscoe62 Posted January 7, 2022

Memory passed memtest without any issues. I successfully backed up UnRaid v6.3.5 and updated to v6.9.2. I then tried the transfer... and it worked. There are still large fluctuations in transfer speed, but that was already explained earlier in the thread. I'll keep an eye on it over the next few large transfers I do.

Thanks, everyone, for the support & feedback.