
HELP!! Problems writing to array, now can't stop array



Okay, I just set up my first unRAID server last week. Everything seemed to go smoothly. I copied over ~4 TB of data onto two WD 2 TB drives that I precleared with the script, plus a 2 TB parity drive. I came home this afternoon to write some files to the server. The server has been idle since last night, except for preclearing an additional drive.

 

I went to write a file to \\tower\disk1 and it wouldn't go. So I went into the management screen and everything looked good. I saw and heard my drives spin up when I went to write the file, but no go. Thinking that maybe not all of the drives had spun up, I clicked the spin up button and heard an additional drive spin up (probably the movies hard drive).

 

I still couldn't write a new TV episode to the folder, but I could read from it fine (play a previous TV episode).

 

I went and clicked the stop array button with the intent of restarting. That was several minutes ago.

 

After 5+ minutes, one data drive is unmounted while the other still shows as unmounting.

 

What caused the original problem? Did I do something wrong with stopping the array?

 

I will attach the complete syslog and the SMART data pulled from all three drives, but I didn't see any issues with any of them. I did run a parity check (no correct) after copying over the two drives' worth of data, and it passed with no errors.

syslog-2011-08-08.txt

Smart_-_Disk_1.txt

Smart_-_Disk_2.txt

Smart_-_Parity.txt

Link to comment

Since the original post, I looked at the syslog and found a wall of blue text along the lines of the following. I uploaded the updated syslog.

 

Aug  8 16:47:37 Tower emhttp: shcmd (33): umount /mnt/user >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: shcmd (34): rmdir /mnt/user >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: shcmd (35): umount /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: _shcmd: shcmd (35): exit status: 1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: shcmd (36): rmdir /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: _shcmd: shcmd (36): exit status: 1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: shcmd (37): umount /mnt/disk2 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: shcmd (38): rmdir /mnt/disk2 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:38 Tower emhttp: Retry unmounting disk share(s)... (Other emhttp)

Aug  8 16:47:43 Tower emhttp: shcmd (39): umount /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:43 Tower emhttp: _shcmd: shcmd (39): exit status: 1 (Other emhttp)

Aug  8 16:47:43 Tower emhttp: shcmd (40): rmdir /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:43 Tower emhttp: _shcmd: shcmd (40): exit status: 1 (Other emhttp)

Aug  8 16:47:43 Tower emhttp: Retry unmounting disk share(s)... (Other emhttp)

Aug  8 16:47:48 Tower emhttp: shcmd (41): umount /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:48 Tower emhttp: _shcmd: shcmd (41): exit status: 1 (Other emhttp)

Aug  8 16:47:48 Tower emhttp: shcmd (42): rmdir /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:48 Tower emhttp: _shcmd: shcmd (42): exit status: 1 (Other emhttp)

Aug  8 16:47:48 Tower emhttp: Retry unmounting disk share(s)... (Other emhttp)

Aug  8 16:47:53 Tower emhttp: shcmd (43): umount /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:53 Tower emhttp: _shcmd: shcmd (43): exit status: 1 (Other emhttp)

Aug  8 16:47:53 Tower emhttp: shcmd (44): rmdir /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:53 Tower emhttp: _shcmd: shcmd (44): exit status: 1 (Other emhttp)

Aug  8 16:47:53 Tower emhttp: Retry unmounting disk share(s)... (Other emhttp)

Aug  8 16:47:58 Tower emhttp: shcmd (45): umount /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:58 Tower emhttp: _shcmd: shcmd (45): exit status: 1 (Other emhttp)

Aug  8 16:47:58 Tower emhttp: shcmd (46): rmdir /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:47:58 Tower emhttp: _shcmd: shcmd (46): exit status: 1 (Other emhttp)

Aug  8 16:47:58 Tower emhttp: Retry unmounting disk share(s)... (Other emhttp)

Aug  8 16:48:03 Tower emhttp: shcmd (47): umount /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:48:03 Tower emhttp: _shcmd: shcmd (47): exit status: 1 (Other emhttp)

Aug  8 16:48:03 Tower emhttp: shcmd (48): rmdir /mnt/disk1 >/dev/null 2>&1 (Other emhttp)

Aug  8 16:48:03 Tower emhttp: _shcmd: shcmd (48): exit status: 1 (Other emhttp)
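
For anyone reading a syslog like this one, here is a rough sketch of how the failing unmounts could be counted per mount point. It is not an unRAID tool: `failed_umounts` is a made-up helper name, and it assumes the whitespace-separated field layout seen in the lines above (command number in the seventh field of the `shcmd` line and in the eighth field of the `_shcmd: ... exit status` line). Paths containing spaces would not be handled.

```shell
#!/bin/bash
# Sketch: count failed umount attempts per mount point in emhttp syslog lines.
# Assumed line shapes (taken from the excerpt above):
#   <date> Tower emhttp: shcmd (N): umount PATH ...
#   <date> Tower emhttp: _shcmd: shcmd (N): exit status: S ...
failed_umounts() {
    awk '
        # Remember which path each numbered umount command targeted.
        $6 == "shcmd" && $8 == "umount" { target[$7] = $9 }

        # A non-zero exit status for a remembered command is a failed unmount.
        $6 == "_shcmd:" && $9 == "exit" && ($8 in target) && $11 != 0 {
            fails[target[$8]]++
        }

        END { for (t in fails) print fails[t], t }
    ' "$@" | sort -k2
}

# Example (reads the named files, or stdin when no file is given):
#   failed_umounts /var/log/syslog
```

The rmdir failures are ignored on purpose, since they only fail because the unmount before them failed.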

 

 

 

 

I did look in the Disk Performance tab of unMENU and saw that device sdd (the one that won't unmount) is doing something. It shows a read of ~150 KB/s, but none of the other drives show any activity.

syslog-2011-08-08_-_updated.txt

Link to comment

A disk cannot be un-mounted if it is busy.  It is busy if a file on it is open for reading or writing, or if a directory on it is the current-working-directory of a process.

 

If you have add-ons running, stop them.  They are probably keeping the disk busy.

If your mover is moving a file to/from the drive (and the disk has the file open as a result), wait for it to complete.
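
To make the "busy" rule above concrete, here is a minimal sketch that looks for processes holding a mount point, using only the Linux /proc filesystem. `busy_pids` is a hypothetical helper name, not an unRAID or unMENU command, and it needs to run as root to see other users' processes. Tools such as lsof or fuser report the same information in more detail.

```shell
#!/bin/bash
# Sketch: print the PIDs whose current working directory or open files
# live under the given mount point. Linux only (relies on /proc).
busy_pids() {
    local mnt=${1%/} pid link target
    for pid in /proc/[0-9]*; do
        # cwd covers the current-working-directory case,
        # fd/* covers files open for reading or writing.
        for link in "$pid/cwd" "$pid"/fd/*; do
            target=$(readlink "$link" 2>/dev/null) || continue
            case $target in
                "$mnt" | "$mnt"/*)
                    echo "${pid#/proc/}"
                    break   # one hit per process is enough
                    ;;
            esac
        done
    done
}

# Example:
#   busy_pids /mnt/disk1
```

An empty result means nothing visible to the caller is holding the mount point, so an unmount should have a chance of succeeding.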

 

Joe L.

Link to comment

I don't have the mover installed. The addons I have installed include:

 

 

Unmenu

bwm-ng - Bandwidth Monitor NG

"C" compiler & development tools

unMENU Image Server

mail and ssmtp

unRAID Status Alert sent hourly by e-mail

Monthly Parity Check

p910nd Shared Printer driver

screen (screen manager with VT100/ANSI terminal emulation)

 

All of the addons were installed via the Unmenu Package installer. Is there a way I can stop all addons? I don't know how to stop the ones that are running.

 

After searching the forums, I used the Open Files view in unMENU to see which file might be open and causing the problem:

 

It shows:

 

Open Files

(from /usr/bin/lsof /dev/md*)

COMMAND  PID USER  FD  TYPE DEVICE      SIZE NODE NAME

smbd    11442 root  cwd    DIR    9,1      104    2 /mnt/disk1

smbd    11442 root  29uW  REG    9,1 339902464 6421 /mnt/disk1/TV/True Blood/Season 4/True.Blood.S04E03.720p.HDTV.X264-DIMENSION.mkv

 

That was the file I was trying to copy that didn't seem to want to copy there. I stopped that copy when it appeared to freeze. I even restarted the computer that was copying the file to the tower. How can I force that file to close/delete?
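
As a reading aid for the FD column in the lsof output above, here is a small sketch that spells out the most common codes as described in the lsof man page. `decode_fd` is a made-up helper name, and only a handful of the special names and lock characters are covered.

```shell
#!/bin/bash
# Sketch: translate an lsof FD column value such as "cwd" or "29uW".
decode_fd() {
    local fd=$1 num rest mode lock
    case $fd in
        cwd) echo "current working directory"; return ;;
        rtd) echo "root directory"; return ;;
        txt) echo "program text (code and data)"; return ;;
        mem) echo "memory-mapped file"; return ;;
    esac
    num=${fd%%[!0-9]*}      # leading digits: the descriptor number
    if [ -z "$num" ]; then
        echo "unrecognised: $fd"
        return 1
    fi
    rest=${fd#"$num"}
    case ${rest:0:1} in     # first character after the number: access mode
        r) mode="read" ;;
        w) mode="write" ;;
        u) mode="read/write" ;;
        *) mode="unknown mode" ;;
    esac
    case ${rest:1:1} in     # second character: lock type
        R) lock="read lock on the entire file" ;;
        r) lock="read lock on part of the file" ;;
        W) lock="write lock on the entire file" ;;
        w) lock="write lock on part of the file" ;;
        "") lock="no lock" ;;
        *) lock="other lock type" ;;
    esac
    echo "fd $num: $mode, $lock"
}

# Example:
#   decode_fd 29uW
```

Read this way, the two smbd lines say that the Samba process has /mnt/disk1 as its working directory and holds the .mkv open for read/write with a whole-file write lock, either of which is enough to keep the disk busy.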

Link to comment

I don't have a cache drive. I was trying to copy a file to the server, but not getting anywhere. So I attempted to stop the array. That file is stuck in limbo. I am trying to manually delete that file via the command line now with: root@Tower:/mnt/disk1/TV/True Blood# rm -rf Season\ 4

But that seems to not be working.

 

But the big question is: Why couldn't I write to the array in the first place? Obviously it wasn't permissions since I can see it did start a file, and just got stuck in limbo.

 

Edit: I found there are two stuck processes on that disk:

 

PID    Access  Command

10391 f.c..        rm

11442 F.c..        smbd

 

I tried to kill the process with kill 10391, kill 11442. But the processes remain.

 

I also tried to kill with kill -9 10391, kill -9 11442 but it didn't work. I am really tempted to just push the power button after the other disc finishes its preclear.

 

 

What will happen if I force the power?

Link to comment

I also tried to kill with kill -9 10391, kill -9 11442 but it didn't work. I am really tempted to just push the power button after the other disc finishes its preclear.

When a kill -9 fails, it indicates the process is locked in kernel space. (and you are not likely to regain control)
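
One way to check this is to look at the process state the kernel reports. A minimal sketch, assuming a Linux /proc filesystem (`proc_state` and `is_uninterruptible` are made-up helper names): a state of D means uninterruptible sleep, usually waiting on I/O, and a process in that state does not act on signals, kill -9 included, until the kernel call it is stuck in returns.

```shell
#!/bin/bash
# Sketch: report the kernel's one-letter state for a PID
# (R running, S sleeping, D uninterruptible sleep, Z zombie, T stopped).
proc_state() {
    awk '/^State:/ { print $2; exit }' "/proc/$1/status" 2>/dev/null
}

# Succeeds only when the process is in uninterruptible sleep.
is_uninterruptible() {
    [ "$(proc_state "$1")" = "D" ]
}

# Example, using the two PIDs from this thread:
#   for p in 10391 11442; do
#       echo "$p: $(proc_state "$p")"
#   done
```

If both PIDs show D, repeated kills will not help, and the processes will only go away when the underlying I/O completes or the machine is restarted.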

 

What will happen if I force the power?

It will power down the server.  When it powers up, it will perform a parity check.  It might take as long as 15 minutes to have the file-systems mount, as they will replay their journals first, and I've seen that take a while if you were busy writing to them when you killed the power.

 

If you can, you might try typing

sync

first before killing the power.

It might help to flush some of what was written to the disks from their journals.
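
A small sketch of that idea, assuming a Linux /proc/meminfo with the usual Dirty and Writeback lines (the function names are made up for illustration). It shows how much data is still waiting to be written before and after the sync. If the disk I/O itself is stuck, the sync may hang as well, in which case it has done as much as it can.

```shell
#!/bin/bash
# Sketch: report how much cached data is still waiting to reach the disks.
pending_writeback_kb() {
    # Dirty = modified but not yet queued; Writeback = being written right now.
    awk '/^(Dirty|Writeback):/ { sum += $2 } END { print sum + 0 }' /proc/meminfo
}

flush_and_report() {
    echo "before sync: $(pending_writeback_kb) kB waiting to be written"
    sync
    echo "after sync:  $(pending_writeback_kb) kB waiting to be written"
}

# Example:
#   flush_and_report
```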

Link to comment

Okay, so after killing the process didn't work, I figured I would leave the server on until I got home from work tomorrow to deal with it.

 

When I woke up this morning, the array had finally stopped, so whatever processes were running must have finally finished. I did a clean shutdown and turned the server back on. Everything seemed to boot up fine and I am running a parity check (no correct) right now.

 

1. Assuming the parity check passes, everything should be okay, right?

 

2. Any ideas why the machine got stuck in limbo when trying to write the file that began this whole process?

 

 

Link to comment
