"smbd not tainted" error



I was writing a long, detailed message when a storm rumbled in and the power went out, so this is actually the 'short' version.  :)  This could be a Samba bug, but the similar one people reported was supposedly fixed in early 2010.  I see many "not tainted" error posts in the forum, but none specifically for smbd.

 

I was FTP'ing directly to the array: FileZilla on Windows 7, saving to a mapped drive.  Most of it went OK, but I came back to find it had stopped, and I could not resume the file transfers, though resume had worked earlier.  I checked the array: I could read from it, and I tested writing a small file to it, but I cannot resume writing those last few files.

 

I ran a memtest just to be sure: no errors over almost 2 complete passes.  Below is the error from the time I had the problem; the full log is attached in HTML with highlighting, etc.

 

I have been running the free version of unRAID 4.7 for only about a week and have written about 2TB to it.  Hardware:

Gigabyte GA-MA785GM-US2H

2GB DDR2 RAM (2x1 Corsair)

1x Hitachi and 3x WD EARS (1 not in array until I get Pro license)

Antec Basiq 550 Plus

 

Aug 18 06:14:02 unraid1 kernel: Pid: 4679, comm: smbd Not tainted (2.6.32.9-unRAID # GA-MA785GM-US2H (Errors)
Aug 18 06:14:02 unraid1 kernel: EIP: 0060:[<c1133477>] EFLAGS: 00210246 CPU: 0
Aug 18 06:14:02 unraid1 kernel: EIP is at radix_tree_lookup_element+0x47/0x68
Aug 18 06:14:02 unraid1 kernel: EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 01000010
Aug 18 06:14:02 unraid1 kernel: ESI: 00133800 EDI: 00000001 EBP: c20cbdd0 ESP: c20cbdc0
Aug 18 06:14:02 unraid1 kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 18 06:14:02 unraid1 kernel: Process smbd (pid: 4679, ti=c20ca000 task=f77006e0 task.ti=c20ca000)
Aug 18 06:14:02 unraid1 kernel: Stack:
Aug 18 06:14:02 unraid1 kernel:  00000001 c509a220 c3703770 00133800 c20cbdd8 c11334a5 c20cbdf8 c1048a7f
Aug 18 06:14:02 unraid1 kernel: <0> c3703774 001337ff 00133800 c509a220 c3703770 00133800 c20cbe0c c1048c59
Aug 18 06:14:02 unraid1 kernel: <0> c509a220 00001000 ffffffff c20cbe30 c1048eb0 000000d0 00133800 c3703770
Aug 18 06:14:02 unraid1 kernel: Call Trace: (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c11334a5>] ? radix_tree_lookup_slot+0xd/0xf (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c1048a7f>] ? find_get_page+0x1d/0x79 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c1048c59>] ? find_lock_page+0x13/0x4c (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c1048eb0>] ? grab_cache_page_write_begin+0x32/0x8e (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c111cb81>] ? fuse_file_aio_write+0x286/0x4fa (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c1225643>] ? sock_common_recvmsg+0x31/0x4a (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c106c46d>] ? do_sync_write+0xbb/0xf9 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c103391d>] ? autoremove_wake_function+0x0/0x30 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c103391d>] ? autoremove_wake_function+0x0/0x30 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c106cc4b>] ? vfs_read+0xfd/0x114 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c106c3b2>] ? do_sync_write+0x0/0xf9 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c106cac4>] ? vfs_write+0x8c/0x116 (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c106d095>] ? sys_pwrite64+0x44/0x5d (Errors)
Aug 18 06:14:02 unraid1 kernel:  [<c1002935>] ? syscall_call+0x7/0xb (Errors)
Aug 18 06:14:02 unraid1 kernel: Code: f6 75 41 eb 38 89 c2 83 e2 fe 8b 02 3b 34 85 10 46 3f c1 89 45 f0 77 2c 6b c0 06 8d 58 fa 89 f0 88 d9 d3 e8 83 e0 3f 8d 54 82 10 <8b> 02 85 c0 74 13 ff 4d f0 74 07 83 eb 06 89 c2 eb e1 85 ff 0f 
Aug 18 06:14:02 unraid1 kernel: EIP: [<c1133477>] radix_tree_lookup_element+0x47/0x68 SS:ESP 0068:c20cbdc0
Aug 18 06:14:02 unraid1 kernel: CR2: 0000000001000010
Aug 18 06:14:02 unraid1 kernel: ---[ end trace 2a8e219fb36cf29d ]---
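
For anyone who wants to compare this trace against other reports, here is a minimal Python sketch that pulls the function names out of the call trace lines above. The `parse_frames` helper and its regular expression are hypothetical, written only for illustration, and are not part of any unRAID or kernel tooling. The sample frames are copied from the log above with the syslog prefix and the "(Errors)" highlight tag stripped, and the Code line is shortened.

```python
import re

# One call-trace entry looks like: "[<c1048a7f>] ? find_get_page+0x1d/0x79".
# The optional "?" is how the kernel marks a frame it is not sure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(log_text):
    """Return (address, symbol, unsure) for each frame listed after 'Call Trace:'."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first line that is not a frame ends the trace
        frames.append(
            (match.group("addr"), match.group("symbol"), match.group("unsure") is not None)
        )
    return frames


# Three frames copied from the log above (prefix and highlight tag stripped).
SAMPLE = """Call Trace:
 [<c11334a5>] ? radix_tree_lookup_slot+0xd/0xf
 [<c1048a7f>] ? find_get_page+0x1d/0x79
 [<c111cb81>] ? fuse_file_aio_write+0x286/0x4fa
Code: f6 75 41 eb 38
"""

for addr, symbol, unsure in parse_frames(SAMPLE):
    print(addr, symbol, "(unsure)" if unsure else "")
```

Because the pattern is applied with `search`, it should also work on the full syslog lines as pasted, prefix and all.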

unraid_log_snip.html.txt

unraid_log_snip.txt

Link to comment

Well, it looks like I am able to reproduce this.  After doing the memtest, rebooting, etc., I decided to try the same FTP transfer again, which had 3 files remaining to be resumed.  This time, 1 of those 3 completed, but again it eventually failed.  Same as before, the array is readable and I am even able to write test files via Windows Explorer, but FileZilla cannot resume the 2 remaining files.

 

I didn't mention this in the first post, but after this happens I try to stop the array from the web admin, and the drives' status changes to "unmounting" but the array never stops.  Shortly afterward the unraid server becomes unresponsive and I have to force it to power off.

 

To recap, I have the unraid share as a mapped drive on Windows 7 64-bit.  I am downloading from an FTP server using FileZilla, and FileZilla is downloading the files directly to the unraid box via the mapped drive.  Just to be clear, this has nothing to do with FTP actually running on the unraid box.  The files were movies, about 4GB-15GB each.  It occurred to me that resuming a file might cause problems with unraid's allocation needs, but I have 1.4TB free space on the array, and I also wouldn't expect such a failed write to cause other problems like not being able to stop the array and then crashing the server.

 

So, am I expecting too much to be able to FTP directly to an unraid box, treating it the same as any drive?  I do this all the time with my small 2-bay NAS and it works fine, but I do realize there's a big difference between those hardware-based dedicated appliances and unraid.

Link to comment

What is the min free space setting on the share?

 

I set it to 60GB, so even with FileZilla writing 3 files at a time I doubt that would have exceeded 60GB; most movies don't even go above 8GB.  This did make me think, though: maybe writing 3 files at a time is too much for unraid?  The speed is 16 Mbit/s at most, so that's very slow, but maybe it has something to do with slow, sustained writing of multiple files.  Just throwing out some guesses.
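
To put rough numbers on that guess, here is a back-of-the-envelope sketch in Python. It assumes the worst case of three 15GB files in flight at once (the largest size mentioned earlier), that 16 Mbit/s is the total link speed rather than per transfer, and decimal units (1GB = 1000MB). None of this models how unraid actually allocates space.

```python
MIN_FREE_GB = 60          # the share's min free space setting
CONCURRENT_TRANSFERS = 3  # FileZilla transfers running at once
LARGEST_FILE_GB = 15      # assumption: every file is the largest size seen
LINK_MBIT_PER_S = 16      # assumption: total link speed, not per transfer

# Space claimed if all three transfers are the largest possible file.
worst_case_in_flight_gb = CONCURRENT_TRANSFERS * LARGEST_FILE_GB
headroom_gb = MIN_FREE_GB - worst_case_in_flight_gb

# 8 bits per byte, so Mbit/s divided by 8 gives MB/s.
throughput_mb_per_s = LINK_MBIT_PER_S / 8
# Time for one largest file if it had the whole link to itself.
hours_per_largest_file = LARGEST_FILE_GB * 1000 / throughput_mb_per_s / 3600

print(f"worst case in flight: {worst_case_in_flight_gb} GB")
print(f"headroom under min free: {headroom_gb} GB")
print(f"throughput: {throughput_mb_per_s} MB/s")
print(f"hours per largest file: {hours_per_largest_file:.2f}")
```

Under these assumptions the worst case stays 15GB below the 60GB setting, and a single 15GB file needs a bit over two hours even with the whole link to itself, so each file stays open for writing for a long time.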

 

When using Windows explorer or Mac finder, I have previously copied directories of hundreds of GBs to the array, so in that way it does seem to work OK.

Link to comment

This sounds like a FileZilla issue. Can you test with a different ftp client?

 

I will find one and try.  There have to be other users out there using FileZilla and probably doing the exact same thing; maybe I will ask for experiences in the forums.

 

Do the LimeTech people look at these forums?  I think the fact that this messes up the server in such a way that you cannot perform a clean shutdown and/or crashes it, is pretty serious. 

Link to comment

I'm looking at the memory allocation failure in a radix lookup, probably on sdc. (is that where the files were being written?) 2GB should be plenty but a filesystem problem (recursion monster) could back the system up against the transfers. Multiple transfers would encourage the problem.

 

Time to check sdc's filesystem? Disable user mods and see if the problem waits until 4 transfers, etc.

Link to comment

Do I check sdc's filesystem using the standard Linux tools from the terminal?  What do you mean by see if the problem waits until 4 transfers?

 

Thanks.

Link to comment

^that.

 

What do you mean by see if the problem waits until 4 transfers?

 

What I meant was watch for it to happen further into the transfers, or with more concurrent transfers than it does now. In most cases, multiple instances of anything will consume multiple pools of memory, triggering memory availability problems earlier. Reducing the number of transfers and other things running should increase available memory and possibly delay the problems. But really, if something has run away after encountering a corrupted directory, well, more memory isn't a fix.
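
One way to test this is to log free memory while the transfers run. Below is a minimal Python sketch that parses `/proc/meminfo`, which is a standard Linux interface. It assumes Python is available on the server, which a stock unRAID install may not provide. The `parse_meminfo` and `log_memory` helpers are hypothetical names, and the sample values are made up for illustration.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer
    value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def log_memory(samples=3, interval_s=10, path="/proc/meminfo"):
    """Print a timestamp plus MemFree and Cached (kB) a fixed number of times."""
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        print(time.strftime("%H:%M:%S"), info.get("MemFree"), info.get("Cached"))
        time.sleep(interval_s)


# Hypothetical sample in the /proc/meminfo format, not taken from the real server.
SAMPLE = "MemTotal:  2000000 kB\nMemFree:  150000 kB\nCached:  1200000 kB\n"
print(parse_meminfo(SAMPLE)["MemFree"])
```

Calling `log_memory(samples=360)` on the server would cover about an hour at the default 10-second interval, which makes it possible to compare one, two, and three concurrent transfers.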

 

BTW, depending on the depth of the directory structures and the memory available, reiserfsck can have trouble rebuilding things. I'd at least disable all your add-ons & reboot before starting a rebuild. Edit: ...before starting a rebuild, should it come to that. Don't worry about it for the --check part.

Link to comment

I didn't run the fsck because I wasn't sure whether it was OK to unmount one of the md volumes, or whether that would mess up the array.  When the array is stopped they don't exist, so was I supposed to run fsck on the individual drive devices?

 

Anyway, I disabled all addons and finished that FTP queue with one transfer at a time.  Then I ran another FTP queue of about 30GB using 2 transfers at a time, and that also completed successfully.  I will now re-enable some addons one at a time and see what happens.  I don't know if I will bother to try 3 transfers again; 2 is good enough as long as I can successfully complete large queues without killing unraid.

Link to comment

I ran the fsck checks on both md vols, no errors.  So would the final word on this be that the server simply couldn't handle 3 simultaneous transfers?  Would increasing the RAM make any difference? 

 

Actually I don't think I can add RAM since I have a monster HS+fan over the first 2 slots, but anyway it would be good to know that this was simply caused by insufficient RAM.  I'm still in my learning/evaluation stage on the free version, but I think I'll be buying Pro soon and adding more drives, at which time I will probably be back here for help again :).

 

Thanks to all.

Link to comment
