[SOLVED] Samba not starting during Data-Rebuild



Hey all, it's been a long time since I've posted...been quite happy with my 4.5.4 server for a while now.

 

I just upgraded a drive in the box today, clean power-down, replaced the drive with a new one...clean power up.  It expanded the filesystem just fine, but now that it is in Data-Rebuild I was expecting the Samba service to start up.  I tried to start it manually and this is the output from the smbd.log:

 

root@Storage:/var/log/samba# more log.smbd
[2011/01/11 17:14:02,  0] smbd/server.c:1065(main)
  smbd version 3.4.5 started.
  Copyright Andrew Tridgell and the Samba Team 1992-2009
[2011/01/11 17:39:55,  0] smbd/server.c:1065(main)
  smbd version 3.4.5 started.
  Copyright Andrew Tridgell and the Samba Team 1992-2009
[2011/01/11 17:41:18,  0] smbd/server.c:1065(main)
  smbd version 3.4.5 started.
  Copyright Andrew Tridgell and the Samba Team 1992-2009
root@Storage:/var/log/samba#

 

The first one was when the server finished the file system expansion, the other two were me manually trying to start Samba.  There are no cores, no messages in syslog but I can't connect via SMB and ps shows no smb process running:

 

root@Storage:/var/log/samba# ps aux | grep samba
root      3196  0.0  0.0   1732   556 pts/0    S+   17:51   0:00 grep samba
root@Storage:/var/log/samba#

 

Should I expect samba to be available at this point?
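
A side note on the check above: `ps aux | grep samba` only matches command lines that contain the string "samba", while the daemon itself is named smbd (see the log header), so that listing may come up empty even when the service is running. Below is a minimal sketch of a more direct check. It assumes a `ps` that accepts `-e -o comm=` (procps does), and `smbd_status` is a hypothetical helper name, not something shipped with unRAID or Samba:

```shell
# Hypothetical helper, not part of unRAID or Samba.
# Reads a list of command names on stdin, one per line (the format
# printed by `ps -e -o comm=`), and reports whether smbd is among them.
smbd_status() {
    # -x matches whole lines only, so "grep" or "smbd-helper" do not count
    if grep -qx 'smbd'; then
        echo "smbd is running"
    else
        echo "smbd is NOT running"
    fi
}
```

Typical use would be `ps -e -o comm= | smbd_status`. Matching the exact process name also avoids the classic problem of grep finding its own command line, which is the only hit in the listing above.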

Link to comment


The only log that is meaningful is the one at

/var/log/syslog

 

See the sticky at the top of this forum on how to capture it for attachment to a post when asking for assistance.

 

Joe L.

Link to comment

Ok, I'll figure out some way to get it when samba isn't starting and scp isn't included in unraid. 

 

That said, these are the only lines that seem to be interesting:

 

Jan 11 17:14:03 Storage emhttp: stale configuration
Jan 11 17:14:03 Storage emhttp: shcmd (4): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Jan 11 17:14:03 Storage emhttp: _shcmd: shcmd (4): exit status: 1
Jan 11 17:14:03 Storage emhttp: shcmd (5): cp /etc/exports- /etc/exports
Jan 11 17:14:03 Storage emhttp: shcmd (6): killall -HUP smbd
Jan 11 17:14:03 Storage emhttp: _shcmd: shcmd (6): exit status: 1
Jan 11 17:14:03 Storage emhttp: shcmd (7): /etc/rc.d/rc.nfsd restart | logger

Link to comment


Obviously you are not a beginner... but the sticky gives several ways, the first being:

You can type "//tower/log/syslog" in your browser address bar (if you changed the name of your server, use that name instead of 'tower').  Then select/copy/paste the text, put into a txt file, and attach to your post.

 

The second involves telnet...

See here: http://lime-technology.com/forum/index.php?topic=9880.0

 

I know you are looking for help, but you are looking in the places you are familiar with from other Linux distributions.  In unRAID almost all log entries go to /var/log/syslog.

 

Is the array started?   What do you see on the management console web-page?

 

Joe L.

Link to comment

Joe,

 

I managed to get it by remembering that ftp is on by default for unraid.  I didn't want to bother with logging telnet, and //tower/<whatever> would only work if Samba were functioning...in which case the post would have been pointless to start with ;).

 

Right now emhttp is running just fine, and shows the data rebuild in process on the disk I replaced.  When I ftped into the server the disks all showed up including the user directories:

 

ftp> cd user
250 Directory successfully changed.
ftp> dir
229 Entering Extended Passive Mode (|||24455|)
150 Here comes the directory listing.
drwx--x--x    1 0        0            1112 Nov 20 19:26 Angel
drwx------    1 0        0              96 Sep 04 04:48 Camera
drwxr-xr-x    1 0        0              80 Apr 13  2008 Data
drwxr-xr-x    1 0        0              48 Nov 13 22:39 Jukebox
drwx--x--x    1 0        0             312 Sep 25  2008 MUSH Objects
drwxr-xr-x    1 0        0             128 Nov 05 21:35 Media
drwx------    1 0        0             384 Nov 25 01:54 Mike
drwxr-xr-x    1 0        0            1624 Nov 29 18:37 Music
drwx--x--x    1 0        0             536 Oct 02  2009 My Movies
drwx--x--x    1 0        0             448 Sep 22 17:37 Tybio
-rw-r--r--    1 0        0            9128 Jan 11 23:27 syslog.bak.gz
226 Directory send OK.

 

So it looks like unraid itself is doing just fine, but samba isn't successfully starting for some reason.

Link to comment

/dev/sdc (currently assigned to disk3) is experiencing many errors:

Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000008
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 60/08:08:a8:88:e0/00:00:e8:00:00/40 tag 1 ncq 4096 in
Jan 11 17:14:02 Storage kernel:          res 41/01:00:af:88:e0/4c:00:e8:00:00/40 Emask 0x401 (device error) <F>
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000008
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 60/08:00:a8:88:e0/00:00:e8:00:00/40 tag 0 ncq 4096 in
Jan 11 17:14:02 Storage kernel:          res 41/01:00:af:88:e0/4c:00:e8:00:00/40 Emask 0x401 (device error) <F>
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000008
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 60/08:00:a8:88:e0/00:00:e8:00:00/40 tag 0 ncq 4096 in
Jan 11 17:14:02 Storage kernel:          res 41/01:00:af:88:e0/4c:00:e8:00:00/40 Emask 0x401 (device error) <F>
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: NCQ disabled due to excessive errors
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000008
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 60/08:00:a8:88:e0/00:00:e8:00:00/40 tag 0 ncq 4096 in
Jan 11 17:14:02 Storage kernel:          res 41/01:00:af:88:e0/4c:00:e8:00:00/40 Emask 0x401 (device error) <F>
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000001
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ DMA EXT
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 25/00:08:a8:88:e0/00:00:e8:00:00/e0 tag 0 dma 4096 in
Jan 11 17:14:02 Storage kernel:          res 51/01:00:af:88:e0/4c:00:e8:00:00/e0 Emask 0x1 (device error)
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000001
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ DMA EXT
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 25/00:08:a8:88:e0/00:00:e8:00:00/e0 tag 0 dma 4096 in
Jan 11 17:14:02 Storage kernel:          res 51/01:00:af:88:e0/4c:00:e8:00:00/e0 Emask 0x1 (device error)
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: sd 3:0:0:0: [sdc] Unhandled sense code
Jan 11 17:14:02 Storage kernel: sd 3:0:0:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Jan 11 17:14:02 Storage kernel: sd 3:0:0:0: [sdc] Sense Key : 0x3 [current] [descriptor]
Jan 11 17:14:02 Storage kernel: Descriptor sense data with sense descriptors (in hex):
Jan 11 17:14:02 Storage kernel:         72 03 13 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
Jan 11 17:14:02 Storage kernel:         e8 e0 88 af 
Jan 11 17:14:02 Storage kernel: sd 3:0:0:0: [sdc] ASC=0x13 ASCQ=0x0
Jan 11 17:14:02 Storage kernel: sd 3:0:0:0: [sdc] CDB: cdb[0]=0x28: 28 00 e8 e0 88 a8 00 00 08 00
Jan 11 17:14:02 Storage kernel: end_request: I/O error, dev sdc, sector 3907029167
Jan 11 17:14:02 Storage kernel: Buffer I/O error on device sdc, logical block 488378645
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000001
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ DMA EXT
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 25/00:08:a8:88:e0/00:00:e8:00:00/e0 tag 0 dma 4096 in
Jan 11 17:14:02 Storage kernel:          res 51/01:00:af:88:e0/4c:00:e8:00:00/e0 Emask 0x1 (device error)
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000001
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ DMA EXT
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 25/00:08:a8:88:e0/00:00:e8:00:00/e0 tag 0 dma 4096 in
Jan 11 17:14:02 Storage kernel:          res 51/01:00:af:88:e0/4c:00:e8:00:00/e0 Emask 0x1 (device error)
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
Jan 11 17:14:02 Storage kernel: ata3.00: configured for UDMA/133
Jan 11 17:14:02 Storage kernel: ata3: EH complete
Jan 11 17:14:02 Storage kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jan 11 17:14:02 Storage kernel: ata3.00: irq_stat 0x40000001
Jan 11 17:14:02 Storage kernel: ata3.00: failed command: READ DMA EXT
Jan 11 17:14:02 Storage kernel: ata3.00: cmd 25/00:08:a8:88:e0/00:00:e8:00:00/e0 tag 0 dma 4096 in
Jan 11 17:14:02 Storage kernel:          res 51/01:00:af:88:e0/4c:00:e8:00:00/e0 Emask 0x1 (device error)
Jan 11 17:14:02 Storage kernel: ata3.00: status: { DRDY ERR }
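
The numbers in the excerpt above are consistent with each other, which helps confirm that every one of these errors points at the same spot on the disk. As a rough sketch (the function names are hypothetical, and the 512-byte sector and 4096-byte buffer block sizes are assumptions that happen to fit the logged values), the last four sense-data bytes can be read as a big-endian sector number and then converted to the logical block the kernel reports:

```shell
# Hypothetical helpers for cross-checking the kernel messages above.

# lba_from_bytes B3 B2 B1 B0: join four hex bytes (most significant first)
# and print the resulting sector number in decimal.
lba_from_bytes() {
    hex=$(printf '%s' "$@")      # "e8" "e0" "88" "af" -> "e8e088af"
    echo $((0x$hex))
}

# sector_to_block SECTOR: convert a 512-byte sector number into a
# 4096-byte logical block number (integer division, both sizes assumed).
sector_to_block() {
    echo $(( $1 * 512 / 4096 ))
}
```

With the values from the log, `lba_from_bytes e8 e0 88 af` gives 3907029167 and `sector_to_block 3907029167` gives 488378645, matching the `end_request` and `Buffer I/O error` lines. That sector is about 2.0 TB into the disk, so the unreadable area sits near the end of a 2 TB drive.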

 

In addition, your FAT file system on your flash drive is corrupted:

Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel:     File system has been set read-only
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)
Jan 11 17:14:02 Storage kernel:     fat_get_cluster: invalid cluster chain (i_pos 1007895)
Jan 11 17:14:02 Storage kernel: FAT: Filesystem error (dev sdg1)

 

I can see where you started the rebuild of the drive:

Jan 11 17:15:37 Storage emhttp: writing mbr on disk 6 (/dev/sdh)
Jan 11 17:15:37 Storage emhttp: re-reading /dev/sdh partition table
Jan 11 17:15:37 Storage kernel:  sdh: sdh1
Jan 11 17:15:38 Storage kernel: mdcmd (30): start UPGRADE_DISK
Jan 11 17:15:38 Storage kernel: unraid: allocating 54060K for 1280 stripes (10 disks)

 

So... Step 1: fix the corruption on the flash drive (run ScanDisk/chkdsk on it in Windows).

Nothing you do can be saved as a new configuration until it is fixed.

Step 2: power down and check the cabling to disk3.  You may have disturbed it when replacing disk6.

 

Joe L.
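
For anyone reading a syslog like this one, the two problems above can be pulled out quickly by counting the relevant kernel messages per device. This is only a sketch: `summarize_syslog` is a hypothetical name, and the field positions assume the exact line layout shown in the excerpts above (month, day, time, host, `kernel:`, then the message).

```shell
# Hypothetical helper: reads a syslog on stdin and prints one line per
# ATA device with failed commands and one line per FAT device with
# filesystem errors. Assumes the line layout seen in this thread.
summarize_syslog() {
    awk '
        $5 == "kernel:" && /failed command:/ {
            port = $6                 # e.g. "ata3.00:"
            sub(/:$/, "", port)
            ata[port]++
        }
        $5 == "kernel:" && /FAT: Filesystem error/ {
            dev = $NF                 # e.g. "sdg1)"
            sub(/\)$/, "", dev)
            fat[dev]++
        }
        END {
            for (p in ata) printf "%s failed_commands=%d\n", p, ata[p]
            for (d in fat) printf "%s fat_errors=%d\n", d, fat[d]
        }
    ' | sort
}
```

It could be run as `summarize_syslog < /var/log/syslog`. A single device with a large count, as with ata3.00 and sdg1 here, is a reasonable hint about where to look first, though it does not say whether the cause is the disk, the cable, or the controller.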

Link to comment

Also, .gz files not supported on the forum?  Interesting!

 

They are now (as of 2 minutes ago)  :P

Thanks Tom, but most of the time I'm here at a Windows Firefox browser, and a .gz extension just complicates my ability to view the syslogs.  I suppose 7zip will un-compress them; only time will tell.  It is just an extra step for me to be able to help somebody, since Windows Explorer could deal with the .zip extensions up until now.

 

Joe L.

Link to comment

Hurm, I don't suppose there is any benefit to letting the rebuild finish before stopping the array and powering it down to fix the flash drive.  Is there any risk to doing it that way?

 

I'm making the assumption that even if it rebuilds properly the errors on the USB drive are going to prevent unraid from knowing it has been rebuilt...I don't see any down side to stopping the rebuild and fixing the flash right away....but getting input is always wise :).

 

While it's down I'll check the connection to disk3

Link to comment

Hurm, I don't suppose there is any benefit to letting the rebuild finish before stopping the array and powering it down to fix the flash drive.  Is there any risk to doing it that way?

Since disk3 is unreadable, the rebuild is probably not going to turn out the way you want anyway.  I'd stop it.

 

I'd also do a memory check.  There are too many seemingly unrelated issues on your server for it not to be suspect.

 

Also, what exact model power supply are you using?  It might not be up to the task of powering 12 disks.

I'm making the assumption that even if it rebuilds properly the errors on the USB drive are going to prevent unraid from knowing it has been rebuilt...I don't see any down side to stopping the rebuild and fixing the flash right away....but getting input is always wise :).

It is a very smart person who admits they do not know it all  ;)  You are correct, it would not be able to record that it is rebuilt.

While it's down I'll check the connection to disk3

Sounds like a plan.

 

Joe L.

Link to comment

Ok,  I'll fix the USB before I mark this as solved.

 

Also, I'm able to FTP to the server and see the contents of disk3...and it seems to have mounted properly:

 

root@Storage:/boot/config# mount
fusectl on /sys/fs/fuse/connections type fusectl (rw)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/sdg1 on /boot type vfat (rw,noatime,nodiratime,umask=0,shortname=mixed)
/dev/md4 on /mnt/disk4 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md3 on /mnt/disk3 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md1 on /mnt/disk1 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md7 on /mnt/disk7 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md2 on /mnt/disk2 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md9 on /mnt/disk9 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md8 on /mnt/disk8 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md5 on /mnt/disk5 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md6 on /mnt/disk6 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
shfs on /mnt/user type fuse.shfs (rw,nosuid,nodev,noatime,allow_other,default_permissions)
root@Storage:/boot/config#
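
One caveat about the listing above: the syslog says the kernel set the FAT filesystem on sdg1 read-only, yet `mount` still shows /boot as rw. On systems where `mount` prints from /etc/mtab, that file may not reflect a remount done by the kernel, so /proc/mounts is usually the better place to look. A small sketch of that check follows; `mount_mode` is a hypothetical helper, and the input is assumed to be in /proc/mounts format (device, mount point, type, options, ...):

```shell
# Hypothetical helper: reads /proc/mounts-format lines on stdin and prints
# "ro" or "rw" for the mount point given as $1 (prints nothing if the
# mount point does not appear in the input).
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            mode = "rw"
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") mode = "ro"
            print mode
        }
    '
}
```

For the flash drive this would be `mount_mode /boot < /proc/mounts`. An answer of `ro` would fit the earlier point that no configuration change can be saved until the flash drive is repaired.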

 

This server has been running for...hell, over 2 years now so I wouldn't be shocked if something is getting a little wonky. 

Link to comment

Thanks, I spent a lot of time working with the people here (including you!) to get the right parts and configuration.

 

Update:

 

The USB drive is now working fine after a checkup on my Windows box.

 

I reseated the cables to ata3 and it is still tossing errors, so I backed out the disk upgrade to get the array back to a good footing with all disks seen as valid.

 

I then swapped ata3 out for a brand new 2TB drive of the exact same model and rebooted to see if the error persisted....

 

Server came up and error is gone (And man, did it boot more quickly!).  So it looks like you've helped me isolate two problems...I've got a disk to RMA.

 

Just think, if the USB drive hadn't gone wonky I might have had this disk fail fully while trying to rebuild another drive, which might have been epically bad.

Link to comment

Ok, they are now (as of 2 minutes ago) no longer enabled  8)

Link to comment
