Mover Freezes



The problem is this:

 

Quote

Apr  7 03:40:04 GrimLock root: move: file /mnt/cache/ipcams/events/1/18/03/08/04/47/00/.6369
Apr  7 03:40:04 GrimLock root: move: create_parent: /mnt/cache/ipcams error: No space left on device
Apr  7 03:40:04 GrimLock root: move: file /mnt/cache/ipcams/events/1/18/03/08/04/47/00/00001-capture.jpg
Apr  7 03:40:04 GrimLock root: move: create_parent: /mnt/cache/ipcams error: No space left on device
Apr  7 03:40:04 GrimLock root: move: file /mnt/cache/ipcams/events/1/18/03/08/04/47/00/00002-capture.jpg
Apr  7 03:40:04 GrimLock root: move: create_parent: /mnt/cache/ipcams error: No space left on device

 

and many thousands more. But what is the cause? Because of the anonymization of your diagnostics there are two shares called i--s (one of which must be 'isos' and the other 'ipcams'), and it looks as though the ipcams one is set to use only disk 9 and the cache. However, there is plenty of free space on disk 9:

 

Quote

/dev/md9        932G   85G  847G  10% /mnt/disk9
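
For reference, a quick way to compare the two ends of the move from the console (the mount points are the standard Unraid ones already shown in the log and df output above):

    df -h /mnt/cache /mnt/disk9    # free space on the cache pool vs disk 9
    df -i /mnt/cache /mnt/disk9    # "No space left on device" can also mean the filesystem is out of inodes, even with free blocks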

 

I notice, about an hour after startup and well before the mover starts, that there's a kernel oops and stack trace:

 

Quote

Apr  6 21:26:05 GrimLock kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
Apr  6 21:26:05 GrimLock kernel: IP: tcp_push+0x4e/0xee
Apr  6 21:26:05 GrimLock kernel: PGD 800000060011c067 P4D 800000060011c067 PUD 60011d067 PMD 0 
Apr  6 21:26:05 GrimLock kernel: Oops: 0002 [#1] PREEMPT SMP PTI
Apr  6 21:26:05 GrimLock kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables vhost_net tun vhost tap veth xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod igb ptp pps_core i2c_algo_bit intel_powerclamp coretemp kvm crc32c_intel mpt3sas intel_cstate intel_uncore raid_class scsi_transport_sas ipmi_si pata_jmicron i2c_i801 ata_piix i5500_temp i7core_edac i2c_core button [last unloaded: pps_core]
Apr  6 21:26:05 GrimLock kernel: CPU: 7 PID: 29642 Comm: java Not tainted 4.14.26-unRAID #1
Apr  6 21:26:05 GrimLock kernel: Hardware name: Supermicro X8DTN/X8DTN, BIOS 2.1c       10/28/2011
Apr  6 21:26:05 GrimLock kernel: task: ffff88060001b300 task.stack: ffffc9000c418000
Apr  6 21:26:05 GrimLock kernel: RIP: 0010:tcp_push+0x4e/0xee
Apr  6 21:26:05 GrimLock kernel: RSP: 0018:ffffc9000c41bd60 EFLAGS: 00010246
Apr  6 21:26:05 GrimLock kernel: RAX: 0000000000000000 RBX: 00000000000005a8 RCX: 0000000000000001
Apr  6 21:26:05 GrimLock kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8805f4159100
Apr  6 21:26:05 GrimLock kernel: RBP: ffff88060001bc20 R08: 000000000000fe88 R09: 0000000000000000
Apr  6 21:26:05 GrimLock kernel: R10: ffff8805f4159258 R11: 0000000000000000 R12: ffff8805f4159100
Apr  6 21:26:05 GrimLock kernel: R13: 0000000000000000 R14: ffff880c0d84a800 R15: 00000000ffffffe0
Apr  6 21:26:05 GrimLock kernel: FS:  0000148093efe700(0000) GS:ffff880c3fac0000(0000) knlGS:0000000000000000
Apr  6 21:26:05 GrimLock kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr  6 21:26:05 GrimLock kernel: CR2: 0000000000000038 CR3: 00000006001da003 CR4: 00000000000206e0
Apr  6 21:26:05 GrimLock kernel: Call Trace:
Apr  6 21:26:05 GrimLock kernel: tcp_sendmsg_locked+0xa53/0xbac
Apr  6 21:26:05 GrimLock kernel: tcp_sendmsg+0x23/0x35
Apr  6 21:26:05 GrimLock kernel: sock_sendmsg+0x14/0x1e
Apr  6 21:26:05 GrimLock kernel: SyS_sendto+0xc0/0xe7
Apr  6 21:26:05 GrimLock kernel: ? vfs_read+0xf3/0x11f
Apr  6 21:26:05 GrimLock kernel: ? SyS_read+0x75/0x81
Apr  6 21:26:05 GrimLock kernel: do_syscall_64+0xfe/0x107
Apr  6 21:26:05 GrimLock kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Apr  6 21:26:05 GrimLock kernel: RIP: 0033:0x1480cf83095b
Apr  6 21:26:05 GrimLock kernel: RSP: 002b:0000148093efad50 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
Apr  6 21:26:05 GrimLock kernel: RAX: ffffffffffffffda RBX: 000000000000002c RCX: 00001480cf83095b
Apr  6 21:26:05 GrimLock kernel: RDX: 0000000000002800 RSI: 0000147ff0006820 RDI: 000000000000002b
Apr  6 21:26:05 GrimLock kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr  6 21:26:05 GrimLock kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00001480300189d8
Apr  6 21:26:05 GrimLock kernel: R13: 0000000000000000 R14: 0000148093efe660 R15: 0000148093efadc0
Apr  6 21:26:05 GrimLock kernel: Code: d0 75 02 31 c0 41 89 f3 41 81 e3 00 80 00 00 74 1a 44 8b 8f 68 05 00 00 41 d1 e9 44 2b 8f 6c 06 00 00 44 03 8f 74 06 00 00 79 10 <80> 48 38 08 8b 8f 6c 06 00 00 89 8f 74 06 00 00 40 80 e6 01 74 
Apr  6 21:26:05 GrimLock kernel: RIP: tcp_push+0x4e/0xee RSP: ffffc9000c41bd60
Apr  6 21:26:05 GrimLock kernel: CR2: 0000000000000038
Apr  6 21:26:05 GrimLock kernel: ---[ end trace 1adddb751bf7b971 ]---

 

which looks like a kernel bug. An oops is like a panic, but less severe: the kernel believes it can continue to run. Even so, you need to reboot. I'd also check the file system on disk 9 just to make sure that it hasn't been corrupted, and consider updating to 6.5.1-rc4, which has a newer kernel.
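
If you prefer the command line to the GUI check, something along these lines works with the array started in Maintenance mode (assuming disk 9 is XFS, which the loaded xfs module suggests; /dev/md9 is the device mounted at /mnt/disk9 in the df output above):

    xfs_repair -n /dev/md9    # -n = no-modify: report problems without changing anything
    # only repeat without -n once you have seen what it wants to fix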

Link to comment
1 hour ago, thegizzard said:

btw.. ipcams is a dedicated (disk9) where the ipcam jpgs go... they are cached then moved daily.

 

Yes, I worked it out in the end. It's just that when the diagnostics are anonymized you only get to see the first and last letters of the share names, and you have two that fit the pattern: ipcams and isos; both appear as "i--s". I'm not sure why share names are considered sensitive information - perhaps it's to hide people's stash of p--n. Hopefully your problem was fixed by the newer kernel.

Link to comment
34 minutes ago, trurl said:

 

You can sometimes figure it out by the number of letters.


ipcams
i----s

isos
i--s

 

 

Yes. I was looking in system/vars.txt to determine which disks were involved in the user share. In that file they both appear as [i..s]. Maybe that representation could be improved?

    [i..s] => Array
        (
            [name] => i..s
            [nameOrig] => i..s
            [comment] => 
            [allocator] => highwater
            [splitLevel] => 
            [floor] => 0
            [include] => disk9
            [exclude] => 
            [useCache] => yes
            [cow] => auto
            [color] => yellow-on
            [free] => 510355828
            [size] => 0
            [luksStatus] => 0
        )

    [i..s] => Array
        (
            [name] => i..s
            [nameOrig] => i..s
            [comment] => I..s
            [allocator] => highwater
            [splitLevel] => 
            [floor] => 0
            [include] => disk1,disk2,disk4,disk5,disk6,disk7,disk8,disk10
            [exclude] => 
            [useCache] => no
            [cow] => auto
            [color] => green-on
            [free] => 6386212116
            [size] => 0
            [luksStatus] => 0
        )
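
Something length-preserving, along the lines trurl showed above, would at least make the two entries distinguishable. A rough sketch of the idea (nothing to do with the actual anonymizer code, just the pattern):

    # length-preserving mask: keep first and last character, dash out the middle
    mask() {
        local s="$1"
        local middle
        middle=$(printf '%*s' $(( ${#s} - 2 )) '' | tr ' ' '-')
        printf '%s%s%s\n' "${s:0:1}" "$middle" "${s: -1}"
    }
    mask ipcams    # -> i----s
    mask isos      # -> i--s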

 

Link to comment

It looks like these diagnostics actually spell out the names of system shares like isos in the shares folder. They keep making improvements to the diagnostics. I get frustrated now trying to look at ones from previous versions, which made you piece together from other things which disk# each SMART report was for.

Link to comment

Ok.  

 

So I removed disk9 from the array. I then checked and fixed the file system. Mover still reported no space left on device for disk9.

 

I reformatted disk9 and removed all of the ipcams files from the cache drive. I tried the mover again and I still get the same error. Can you think of any reason why it thinks disk9 is full? Or, maybe more importantly, why mover cannot write to disk9?

 

updated diagnostics.

 

grimlock-diagnostics-20180410-1146.zip

Link to comment

It might be instructive to understand why it was excluded, though. The default for that setting is nothing in either Include or Exclude, which means all disks are included.

 

Include means ONLY these disks

Exclude means EXCEPT these disks

 

And no reason to ever put something in both.
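
For what it's worth, a share set up the way thegizzard describes would look roughly like this on the flash drive (the path assumes the share is literally named ipcams, and the field names are the usual Unraid 6.x ones, so treat this as an illustration rather than his actual file):

    # /boot/config/shares/ipcams.cfg  (illustrative, not taken from the posted diagnostics)
    shareUseCache="yes"     # write to the cache first, let mover flush it to the array
    shareInclude="disk9"    # Include means ONLY these disks
    shareExclude=""         # leave Exclude empty - never set both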

 

thegizzard, why did you have this set to not include disk9?

Link to comment

It is always recommended that you leave those fields blank unless you have a specific reason not to include all disks. Even then, I would have thought that putting entries in the Exclude field (you should never set both Include and Exclude) is normally the more efficient way to manage this.

Edited by itimpi
Link to comment

I intended disk9 to be a dedicated disk for ipcam jpgs. This is a lot of small files and I didn't want to have the array spinning all disks for them, so I set the share to only use disk9 and made it one of the few that use the cache.

 

I imagine I could have done this without using the global settings, but that's where I screwed up.

 

Link to comment
4 minutes ago, thegizzard said:

I intended disk9 to be a dedicated disk for ipcam jpgs. This is a lot of small files and I didn't want to have the array spinning all disks for them, so I set the share to only use disk9 and made it one of the few that use the cache.

 

i imagine i could have done this without using the global settings, but thats where i screwed up.

 

 

Disks that are not included in Global Share Settings are not part of User Shares, which makes me wonder why mover would even try to target disk9. Does split level supersede even Global Share Settings?
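
The global setting is also stored on the flash, so it's easy to confirm which disks user shares are allowed to use at all (field names are from memory of Unraid 6.x, so verify against your own file):

    grep -iE 'include|exclude' /boot/config/share.cfg
    # shareUserInclude / shareUserExclude are the global equivalents of the per-share fields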

Link to comment
