March 9, 2012
In an effort to clear all files off a red-balled disk I copied the last remaining files to the cache disk (to be copied back later). I unassigned the red-balled drive when the bad drive was empty and did the initconfig. I reassigned the drives and started the parity sync. This was all done last night.

This morning I checked how things were going, looked at my cache drive, and all the files I copied there were gone? I purposefully disabled the mover script and I see no evidence of mover activity in the syslog. I mounted the bad drive outside of the array and it *seems* like they got copied to the array. I did a directory diff of the bad drive to the array and all the files appear to be there.

How could this happen? I used MC to copy the files yesterday from /mnt/disk4/TV to /mnt/cache/TV. And I saw the cache drive fill up! It was close to 50% used yesterday and now it's nearly empty?? How did they get off the cache drive? There are two other files on the cache drive that are still there from a different share (due to not enough space on the share).

Could the process of rebuilding parity have caused the cache disk to be emptied? Did I have to reboot to have the mover script disabled?

I see this in the syslog:

Mar 8 17:30:21 Tower emhttp: shcmd (381): crontab -c /etc/cron.d - <<< "# Generated mover schedule: 40 3 * * * /usr/local/sbin/mover |& logger"
Mar 8 17:30:21 Tower emhttp: shcmd (382): /usr/local/sbin/emhttp_event disks_mounted
Mar 8 17:30:21 Tower emhttp_event: disks_mounted
Mar 8 17:30:21 Tower emhttp: shcmd (383): :>/etc/samba/smb-shares.conf
Mar 8 17:30:21 Tower emhttp: Restart SMB...
Mar 8 17:30:21 Tower emhttp: shcmd (384): killall -HUP smbd
Mar 8 17:30:22 Tower emhttp: shcmd (385): ps axc | grep -q rpc.mountd
Mar 8 17:30:22 Tower emhttp: _shcmd: shcmd (385): exit status: 1
Mar 8 17:30:22 Tower emhttp: shcmd (386): /usr/local/sbin/emhttp_event svcs_restarted
Mar 8 17:30:22 Tower emhttp_event: svcs_restarted
Mar 8 17:45:56 Tower emhttp: shcmd (387): /usr/sbin/hdparm -y /dev/sdf &> /dev/null
Mar 8 17:54:11 Tower emhttp: shcmd (388): crontab -c /etc/cron.d - <<< "# Generated mover schedule: 40 3 * * * /usr/local/sbin/mover &> /dev/null"

and today I see

Mar 9 08:45:53 Tower emhttp: shcmd (63): crontab -c /etc/cron.d - <<< "# Generated mover schedule: 40 3 * * * /usr/local/sbin/mover $stuff$> /dev/null"
(Other emhttp)

Could the mover have run silently? I normally see stuff like this:

Mar 8 03:40:01 Tower logger: mover started
Mar 8 03:40:01 Tower logger: moving TV/
Mar 8 03:40:01 Tower logger: ./TV/
Mar 8 03:40:11 Tower logger: .d..t...... ./
Mar 8 03:40:28 Tower logger: .d..tpog... TV/
Mar 8 03:40:28 Tower logger: mover finished

But there is nothing like that in the syslog. Could the logger process have stopped, leaving me with no evidence of the mover running?

I want to try and understand what happened. I believe my files are there... but before I go testing and wiping the bad disk, I want to convince myself that the files did get transferred ok.

On a side note, and completely unrelated, I did get this repeated several times. Is this anything?

Mar 8 19:11:57 Tower unmenu[3557]:
Mar 8 19:11:57 Tower unmenu[3557]: WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.
Mar 8 19:11:57 Tower unmenu[3557]:
Mar 8 19:11:57 Tower unmenu[3557]:
Mar 8 19:11:57 Tower unmenu[3557]: WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util fdisk doesn't support GPT. Use GNU Parted.
Mar 8 19:11:57 Tower unmenu[3557]:
Mar 8 19:12:37 Tower unmenu[3557]:

And I also saw a process crash?
Mar 9 08:39:45 Tower kernel: md2: stopping
Mar 9 08:39:45 Tower kernel: md3: stopping
Mar 9 08:39:45 Tower kernel: md5: stopping
Mar 9 08:39:45 Tower kernel: md6: stopping
Mar 9 08:39:45 Tower kernel: md7: stopping
Mar 9 08:39:45 Tower kernel: md8: stopping
Mar 9 08:39:45 Tower kernel: md: using 1536k window, over a total of 2930266532 blocks.
Mar 9 08:39:45 Tower kernel: BUG: unable to handle kernel NULL pointer dereference at 00000040
Mar 9 08:39:45 Tower kernel: IP: [<c131ffdf>] _raw_spin_lock_irq+0x9/0x1a
Mar 9 08:39:45 Tower kernel: *pdpt = 000000002f897001 *pde = 0000000000000000
Mar 9 08:39:45 Tower kernel: Oops: 0002 [#1] SMP
Mar 9 08:39:45 Tower kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-6/2-6:1.0/host7/target7:0:0/7:0:0:0/block/sdb/stat
Mar 9 08:39:45 Tower kernel: Modules linked in: ntfs md_mod xor pata_jmicron mvsas ahci libsas scsi_transport_sas jmicron r8169 i2c_i801 i2c_core libahci [last unloaded: md_mod]
Mar 9 08:39:45 Tower kernel:
Mar 9 08:39:45 Tower kernel: Pid: 3141, comm: mdrecoveryd Not tainted 2.6.37.6-unRAID #4 Gigabyte Technology Co., Ltd. EP43-UD3L/EP43-UD3L
Mar 9 08:39:45 Tower kernel: EIP: 0060:[<c131ffdf>] EFLAGS: 00010093 CPU: 0
Mar 9 08:39:45 Tower kernel: EIP is at _raw_spin_lock_irq+0x9/0x1a
Mar 9 08:39:45 Tower kernel: EAX: 00000040 EBX: 00000000 ECX: 00000000 EDX: 00000100
Mar 9 08:39:45 Tower kernel: ESI: 00000000 EDI: 00000000 EBP: f1973e18 ESP: f1973e18
Mar 9 08:39:45 Tower kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Mar 9 08:39:45 Tower kernel: Process mdrecoveryd (pid: 3141, ti=f1972000 task=f18d3850 task.ti=f1972000)
Mar 9 08:39:45 Tower kernel: Stack:
Mar 9 08:39:45 Tower kernel: f1973e74 f8fc5833 00000004 00000000 f1a2ca80 f1973e48 c109603d f1973e48
Mar 9 08:39:45 Tower kernel: c109604e 00000040 00000000 00000000 f1973e68 c10960b2 00000fff 00000000
Mar 9 08:39:45 Tower kernel: 00000000 00000000 00000000 f6c49028 00000000 00000000 f6c49028 f1973e88
Mar 9 08:39:45 Tower kernel: Call Trace:
Mar 9 08:39:45 Tower kernel: [<f8fc5833>] ? get_active_stripe+0x32/0x3a8 [md_mod]
Mar 9 08:39:45 Tower kernel: [<c109603d>] ? vfs_fsync_range+0x3c/0x5e
Mar 9 08:39:45 Tower kernel: [<c109604e>] ? vfs_fsync_range+0x4d/0x5e
Mar 9 08:39:45 Tower kernel: [<c10960b2>] ? generic_write_sync+0x53/0x62
Mar 9 08:39:45 Tower kernel: [<f8fc5bb7>] ? unraid_sync+0xe/0x42 [md_mod]
Mar 9 08:39:45 Tower kernel: [<f8fc1b48>] ? md_do_sync+0x14c/0x3b4 [md_mod]
Mar 9 08:39:45 Tower kernel: [<f8fc20f1>] ? write_file+0xb1/0xdd [md_mod]
Mar 9 08:39:45 Tower kernel: [<c107c256>] ? do_sync_write+0x0/0xc5
Mar 9 08:39:45 Tower kernel: [<f8fc22b7>] ? md_do_recovery+0x115/0x197 [md_mod]
Mar 9 08:39:45 Tower kernel: [<f8fc22b7>] ? md_do_recovery+0x115/0x197 [md_mod]
Mar 9 08:39:45 Tower kernel: [<f8fc2948>] ? md_thread+0xd6/0xed [md_mod]
Mar 9 08:39:45 Tower kernel: [<c103b5c5>] ? autoremove_wake_function+0x0/0x2f
Mar 9 08:39:45 Tower kernel: [<f8fc2872>] ? md_thread+0x0/0xed [md_mod]
Mar 9 08:39:45 Tower kernel: [<c103b2ca>] ? kthread+0x62/0x67
Mar 9 08:39:45 Tower kernel: [<c103b268>] ? kthread+0x0/0x67
Mar 9 08:39:45 Tower kernel: [<c1002cf6>] ? kernel_thread_helper+0x6/0x10
Mar 9 08:39:45 Tower kernel: Code: eb f6 5d c3 55 89 e5 9c 59 fa ba 00 01 00 00 3e 66 0f c1 10 38 f2 74 06 f3 90 8a 10 eb f6 89 c8 5d c3 55 89 e5 fa ba 00 01 00 00 <3e> 66 0f c1 10 38 f2 74 06 f3 90 8a 10 eb f6 5d c3 55 89 e5 fe
Mar 9 08:39:45 Tower kernel: EIP: [<c131ffdf>] _raw_spin_lock_irq+0x9/0x1a SS:ESP 0068:f1973e18
Mar 9 08:39:45 Tower kernel: CR2: 0000000000000040
Mar 9 08:39:45 Tower kernel: ---[ end trace 384f006f2c70e0a5 ]---
Mar 9 08:39:45 Tower emhttp: shcmd (420): rmmod md-mod |& logger
Mar 9 08:39:45 Tower emhttp: shcmd (421): udevadm settle

Anything I should worry about?

Thanks,
jim
March 9, 2012
How did you disable the mover? There is no "disable" setting in the interface. Did you use the drop-down box, which only disables the mover logging? The generated schedule in your syslog sets the mover to run at 3:40am, so it was not disabled.
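For anyone reading the same syslog lines, here is a minimal sketch of how the generated crontab entry can be read. It assumes the standard five-field cron layout (minute, hour, day of month, month, day of week) followed by the command; the function name `describe_cron_entry` is made up for illustration and is not part of unRAID.

```python
def describe_cron_entry(entry):
    """Split a five-field cron entry and summarize when and how it runs."""
    # Standard cron layout: minute hour day-of-month month day-of-week command
    minute, hour, day_of_month, month, day_of_week, command = entry.split(None, 5)

    # Only format a clock time when minute and hour are plain numbers.
    run_time = None
    if minute.isdigit() and hour.isdigit():
        run_time = f"{int(hour):02d}:{int(minute):02d}"

    return {
        "time": run_time,
        # "* * *" in the last three schedule fields means every day.
        "runs_daily": (day_of_month, month, day_of_week) == ("*", "*", "*"),
        # Output piped to logger shows up in syslog; /dev/null hides it.
        "logs_to_syslog": "logger" in command.split(),
        "command": command,
    }


if __name__ == "__main__":
    # The two entries copied from the syslog excerpt in the first post.
    for line in (
        "40 3 * * * /usr/local/sbin/mover |& logger",
        "40 3 * * * /usr/local/sbin/mover &> /dev/null",
    ):
        print(describe_cron_entry(line))
```

Both entries still schedule /usr/local/sbin/mover daily at 03:40. The only difference between them is where the output goes, which matches the drop-down toggling logging rather than the mover itself.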
March 9, 2012 Author
Ok.. Do I feel like an idiot! I always thought that setting disabled the mover script... not just the logging! Looking back, it *does* say "mover logging" enable/disable! Maybe next time I'll read the page better!

Well, that does explain a lot!!!!

Well now, at least, I know I can wipe the drive!!!

Jim "wearing the dunce cap right now"
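Before wiping, a content check is a bit stronger than a directory diff of names. The following is only a sketch under assumptions: it hashes every file under two directory trees with SHA-256 and reports files that are missing or differ. The function names are made up for illustration, and the paths in the usage note are placeholders for wherever the old disk and the array share happen to be mounted.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            digests[os.path.relpath(full_path, root)] = file_digest(full_path)
    return digests


def compare_trees(source_root, target_root):
    """Return (missing, mismatched) relative paths, judged from the source side."""
    source = tree_digests(source_root)
    target = tree_digests(target_root)
    missing = sorted(path for path in source if path not in target)
    mismatched = sorted(
        path for path in source if path in target and source[path] != target[path]
    )
    return missing, mismatched
```

Usage would look like `missing, mismatched = compare_trees("/path/to/old_disk/TV", "/path/to/array/TV")` with both paths replaced by the real mount points. Two empty lists mean every file on the old disk has an identical copy on the array. Extra files that exist only on the array side are ignored on purpose.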
March 9, 2012
;D Shit happens. It likely just made your poor disks work harder, rebuilding and moving at the same time.
March 9, 2012 Author
I wondered why it seemed to take a couple hours longer than it should have!