delayed write errors



Using the same hardware that Lime-Tech currently sells (SuperMicro C2SEE, the 4 port Adaptec PCI-E controller, etc)

 

Current system for testing is configured as:

 

UnRaid 4.4.2 Pro

1 Parity

1 Write cache

1 Disk

 

Using TotalCommander to copy from an old NAS to UnRaid.

 

Problem: When copying single files, they complete fine. However, when I try to copy a large batch of files (about 200 gigs' worth), I receive a delayed write error.

 

I have never seen this problem in the past couple of years that I have used this system with my ReadyNAS, so I would suspect that it has to do with something in UnRaid.

 

 

Any suggestions here?

 

Link to comment


When I drive I sometimes don't get to my destination.  Can you tell me why?  ;)

 

To be able to help me, you would need a lot more information.  It would not help if all I said was "I was driving a brown 2009 Toyota." 

 

What would help us to help you is:

Where did you see the error message? (On windows? or in your unRAID syslog? or in TotalCommander? or on your other NAS?)

What was the exact error message?

Are you getting network I/O errors?  (output of ifconfig eth0 would help)

Are your disks working properly? Many of the recent Seagate drives go to sleep for periods of time due to a firmware bug, that might make TotalCommander error as it waits. (a copy of your syslog, before you reboot, but after you experienced the error would tell a LOT)

What other activity might have been occurring on your LAN to cause issues?
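For the network question above, here is a rough sketch of pulling the error counters out of `ifconfig eth0` output. It assumes the classic net-tools layout (`RX packets:... errors:...`). The sample text and its numbers are made up for illustration, and the function name `interface_problems` is hypothetical.

```python
import re

# Illustrative sample in the classic net-tools layout; the numbers are invented.
SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          RX packets:1200345 errors:0 dropped:0 overruns:0 frame:0
          TX packets:998877 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
"""


def interface_problems(ifconfig_text):
    """Return every non-zero RX/TX problem counter, e.g. {'RX errors': 7}."""
    problems = {}
    for line in ifconfig_text.splitlines():
        match = re.match(r"\s*(RX|TX) packets:\d+\s+(.*)", line)
        if not match:
            continue  # not one of the two packet-counter lines
        direction, rest = match.groups()
        for name, value in re.findall(r"(\w+):(\d+)", rest):
            if int(value) > 0:
                problems[f"{direction} {name}"] = int(value)
    return problems


if __name__ == "__main__":
    print(interface_problems(SAMPLE) or "no interface errors reported")
```

An empty result only says the NIC counters are clean. It does not rule out problems further up the stack.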

 

If you google "delayed write error" you can see lots of causes, including this from microsoft:

http://support.microsoft.com/kb/321733

or this

http://www.gibni.com/windows-delayed-write-failed-solved

 

It could be an issue with the SAMBA version in 4.4.2 unRAID.  It is a fairly new version (Samba 3.2.3), but I notice that it has been updated about three or four times since then, and the current "stable" release of Samba is 3.3.0.   At some point, it will end up in an unRAID release.  But since many of us are using the 4.4.2 version and not having issues, who knows.  Most of us are not transferring 200Gig at once... it might have to do with the size of the transfer.

 

Most helpful would be a copy of your syslog.  See the wiki for instructions on how to post one. 

 

Joe L.

Link to comment

The error was a popup from windows & could also be seen in the event viewer.

 

Parity and Disk1 are the nefarious Seagate 1.5TB's, however they are recently shipped from Dell (about 1 month ago) -- with CC1H firmware I believe.

 

 

Link to comment


As per: http://support.microsoft.com/kb/321733 posted by Joe L.

 

To determine whether the client is experiencing the problem that is described in this article, check the event log. The event log must contain event ID 50 with a source of MrxSMB. The event contains the same text as the error message, but also contains the error status in the data section. Double-click the event, and then click Words for the data type; the last word contains the status. If the status code is c0000022 (which translates to STATUS_ACCESS_DENIED), apply the hotfix that is described in this article.

 

If that is what you are seeing, then according to the article this is a Windows problem that is solved by installing a hotfix.  The hotfix can be found via a link in that article, which I will copy here: http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=321733&kbln=en-us
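The event-log check quoted above can be sketched in a few lines. This assumes you have typed the event's data words into a list by hand, and the sample words in the usage note are hypothetical. The two status values are standard NTSTATUS codes, and the function names are made up for this example.

```python
# Standard NTSTATUS values; only the first one is the case KB 321733 covers.
KNOWN_STATUS = {
    0xC0000022: "STATUS_ACCESS_DENIED",
    0xC000007F: "STATUS_DISK_FULL",
}


def status_from_event_words(words):
    """Decode the last hex word of the event data, as the KB article describes."""
    code = int(words[-1], 16)
    return code, KNOWN_STATUS.get(code, "unknown status")


def hotfix_applies(words):
    """True only when the status is c0000022 (STATUS_ACCESS_DENIED)."""
    code, _ = status_from_event_words(words)
    return code == 0xC0000022
```

For example, `hotfix_applies(["00040004", "00000000", "c0000022"])` returns `True`. Any other final word means the hotfix in the article is not aimed at your case.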

 

If that is not the problem, I found this article, which has other ideas to try (I didn't look it over, but I saw it referenced MANY times in relation to this error).

http://searchwinit.techtarget.com/tip/0,289483,sid1_gci1041334,00.html

 

Cheers,

Matt

Link to comment

atari is a friend of mine and we are working together to resolve this issue.  We both have the same exact nas hw, down to the case.  I now have an account so I'll post some answers and more questions. :-\

 

First, I have XP sp3, which means none of the hotfix solutions apply, but thanks for posting anyway.  The hotfix recommended is for XP sp1 systems only and cannot be installed on a system with sp3.  That said, I did a search and it doesn't appear that MS included Q321733 in sp3, thanks a lot MS!  I also searched and was not able to find any post sp3 hotfixes for delayed write errors.  I also reviewed the other link re delayed write errors (thanks for posting), none of those situations seem to apply either.  Again, this issue is only when writing to the unraid nas.

 

As atari already said, there is a popup box saying that a delayed write failure had occurred and data has been lost.....bla..bla..  The event viewer shows the error as well, Event ID 50.  My only options are to skip that file, in which case the copy proceeds to the next file, or to abort.  Of course I abort.  I've copied TB's of data back and forth without issues when going to my ReadyNas or Linksys Nas.  My copy attempts with the unraid nas start out fine and run for an hour, sometimes 2, before erroring out.

 

Ok, you need more information:

 

I have a ReadyNas and a Linksys NAS200 and do not have any issues copying files between them using this same system.  The errors only started after I attempted to copy to my new unraid nas, which means I'm not sure the problem is my XP system.  My system has a gig network adaptor, and so does my ReadyNas.  I have a gig switch (netgear).  I connected my new UnRaid nas via gig as well (of course).

 

My unRaid console shows no errors.  ifconfig eth0 shows lots of packets tx rx and no errors at all.  The unraid server console shows no errors as well, on the Main page.  My cache drive is a ST3320620AS.

Memtest ran for a long time, several loops with no errors.  Temps all look good.  I have not made any changes, so the unraid config is default (not that there is much to change anyway).  Spin down time is set to 1 hour, etc.

I checked the cables, everything looks ok.

 

I'm lost as to what I can check now.  Should I try to confirm drive firmware?  If yes, how?  Should I try taking the cache drive out of the mix (would hate to do that)?  Are there any other diags I can run on my XP or unraid system?

 

Thanks!

 

 

Link to comment

First let me point out that the volume of data you are copying in one sitting likely exceeds what most people with an established array do.  You could have found a bug.  Or you might have found an incompatibility.  Or you may have a client issue.

 

The first thing to narrow down is whether this is happening with more than one client.  Do you have another machine (maybe running XP SP2 and not SP3) that you could test with?  If you are getting the same error from 2 separate clients, it kind of points at the server.  I have, over the years, had issues copying files to and from Windows clients (without unRAID involved).  The solution required reloading the OS, because I had no real way to diagnose.

Link to comment

I hear what you are saying, but if it is not the unraid server, then why can I copy gigs of data to and from my xp system to my ReadyNas without issues?  It seems to point at the server, in my opinion.  That said, I do have a mac, so I'll connect to both my readynas and unraid server and try the copy that way.  I don't have my xp laptop with me right now, but it has sp2 and I can try that as well next week.  While I feel it points to the server, MS OSes do suck.

 

I took a few screen shots, see attached file.

 

Also on the server, I have Local Master set to No and smb ports are default, 445,139.

 

Being my cache drive is only 320gb, what happens if I'm trying to copy 500gb?  Will it just move the cache data to my other drive(s) on the fly, or will that cause a disk full issue?  When I get the delayed write error, Total Commander says disk is full.  See the attached file for screen shots.

 

Thanks

Link to comment


No, the mover is scheduled to run once a day.  Once you have filled up your cache drive (moved 320gb of data to it), it is full, and the mover needs to run to clear it BEFORE you can add more.  It sounds like your problem is that you reach 320gb (filling your cache drive) and there is no more room for more data.

 

I would disable the cache drive if you are moving LOTS of information over (initial dump on the server) and use the cache when your daily data addition is less than 320gb.
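The arithmetic behind that advice, as a tiny sketch. It models the cache as a fixed amount of free space that is only emptied by the once-a-day mover run and ignores everything else (minimum free space settings, per-file behaviour). `plan_copy` is a made-up helper name.

```python
def plan_copy(batch_gb, cache_free_gb):
    """Say whether a batch fits on a cache that is emptied only once a day."""
    if batch_gb <= cache_free_gb:
        return "fits: the whole batch can land on the cache drive"
    shortfall = batch_gb - cache_free_gb
    return (f"does not fit: {shortfall} GB short; copy in smaller batches, "
            "run the mover between batches, or disable the cache first")


if __name__ == "__main__":
    print(plan_copy(500, 320))  # the 500gb copy onto a 320gb cache from this thread
```

Note that "free" here means what is free right now. Anything still sitting on the cache from earlier copies counts against the 320gb.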

 

Cheers,

Matt

Link to comment

I was about to post but now see Biggy's post and won't repeat.

 

I believe (but not sure) that if unRAID detects that too big a file is being copied, it will copy it directly to the array and not to cache.  But there are two ways files are created.  In one scenario, the size of the file is broadcasted at the very beginning of the transfer, and in the other the file is just sent and the recipient has no idea how much is coming.  Obviously only the former would be able to intelligently decide where to place the file.  (I also believe that if you are OVERWRITING a file already on the array, it will copy to the array and not use the cache.)

 

Note that you can copy directly to a particular disk (e.g., disk3) rather than to the user share and it will not try to use the cache disk.  I believe that it is also faster since it does not go through the FUSE (user share) layer.

 

Once your array(s) are loaded up, your 320G cache should be plenty.

Link to comment


Whoops, I was so busy trying to save the pics and upload them that I didn't see this reply.  Wow, that is too easy.  So I was simply filling up the cache drive and that caused the full disk error and delayed write?  Wow.  Ok, I will try disabling the cache for the initial large copy and see if that works.  Thanks a bunch!

 

Link to comment

Ok, I'm trying to figure out how not to lose what I already have on the cache drive.  I've gone into the Shares tab and selected Move Now, but nothing is happening.  The cache drive still shows 26gb free.  The settings on the cache drive are: Min free space = 2000000 and Mover Schedule = 40 3 * * *
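The Mover Schedule value above is a standard five-field cron expression (minute, hour, day of month, month, day of week), so `40 3 * * *` means 03:40 every day. Here is a small sketch that works out the next scheduled run for that kind of plain daily schedule. It deliberately rejects anything fancier, and `next_daily_run` is a hypothetical helper, not part of unRAID.

```python
from datetime import datetime, timedelta


def next_daily_run(cron_expr, now):
    """Next run time for a 'MIN HOUR * * *' schedule; other forms are rejected."""
    minute, hour, day_of_month, month, day_of_week = cron_expr.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("only plain daily schedules are handled in this sketch")
    run = now.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    if run <= now:
        run += timedelta(days=1)  # today's slot has passed, so use tomorrow's
    return run


if __name__ == "__main__":
    print(next_daily_run("40 3 * * *", datetime.now()))
```

So a cache drive that fills up in the afternoon will not be emptied by the schedule until 03:40 the next morning.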

 

I'm watching and refreshing the Main screen and do not see the cache drive free space increasing, meaning that it doesn't look like the cache drive is moving anything???

 

When I connect directly to the cache and disk1, I see most of the files are still on the cache drive??  I'm thinking the Move Now doesn't work

 

Suggestions?

 

Thanks

Link to comment

Not sure why, but the cache drive is now moving files....Main page is now showing 78gb free and increasing....guess it just takes a while.  What is the best process to disable the cache drive without losing anything?  Do I just stop the array and remove the Cache drive on the Devices tab?

 

While I only have 1 data disk right now, I have 7 more to add, and I want to use Split Level so my data is stored using all available disks.

 

Thanks!

Link to comment

The "moves" from the cache drive to another drive must be performed as a copy, then a remove from cache if the copy returns no errors.  If "moving" a large file, it can take a while.  You might not see any "change" in free space until the first file is finally removed from the cache drive. 

 

A copy of a 4 gig file from one drive to another within the server still takes a while to perform, no matter how you do it.  The only condition where it can take less time is if you move a file within the same file-system on the same disk.  Then you are only really moving the directory entries that point to the file and not the data of the file itself.  A move within a file system is almost instant.
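The two cases described above (rename within one file system versus copy-then-remove across file systems) look roughly like this. It is a generic sketch and not the actual unRAID mover. `move_file` is a made-up name, and comparing `st_dev` is just one common way to tell whether two paths are on the same file system.

```python
import filecmp
import os
import shutil
import tempfile


def move_file(src, dst):
    """Rename when possible; otherwise copy, verify, then delete the source."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    if os.stat(src).st_dev == os.stat(dst_dir).st_dev:
        os.rename(src, dst)  # same file system: only directory entries change
        return "renamed"
    shutil.copy2(src, dst)  # different file system: every byte is copied
    if not filecmp.cmp(src, dst, shallow=False):
        raise OSError(f"copy of {src} did not verify; source left in place")
    os.remove(src)  # free the space only after a clean copy
    return "copied"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        src = os.path.join(folder, "movie.iso")
        with open(src, "wb") as handle:
            handle.write(b"\0" * 4096)
        print(move_file(src, os.path.join(folder, "moved.iso")))
```

In the copy branch the source's space is only released at the very end, which is why free space on the cache drive does not budge until a whole file has finished.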

 

Joe L.

Link to comment

Just wanted to close this one out.  The delayed write errors were really the cache drive running out of space.  I'll check the Future Options request area to see if there are plans to change this in future upgrades.  It would have been nice if the server would have just started writing directly to the drives when the cache drive got full. 

 

That said, I would rather have a disk full issue than a darn delayed write one!

 

Thanks for all of the help.

Link to comment

The ability to do as you ask is probably already there... But... it completely depends on the program creating the files.

 

Some programs open a file to write to it with an initial size of zero bytes, and then write the contents, filling it.  If your cache drive had ANY free space, even just a few bytes, this will gladly open up a file on the cache drive and then fail as soon as it runs out of space.  It doesn't have to be just a few bytes left though... There is no way for the unRAID server to know you will be writing 5 gigs in a single .ISO when there is 4 gigs of free space and your copy program starts with a zero length file and then fills it, growing its size as it goes.  It will copy the first 4 gigs and then fail.

 

Other programs create the initial file at its full size, and then fill it.  In this case, the cache drive would not have room for the full size file, and it would instead be created directly on the "protected disk" and bypass the cache entirely.  It would be slower, as parity would be calculated as it goes, but it would be successful.  A program using this method would not fail when the cache drive filled.

 

It all depends on the program you are using to copy the files to the unRAID server...  There is no way for it to predict the eventual size of a new zero length file.  (I personally don't use a cache drive, but then I've loaded my server incrementally and was never migrating from another server.  The "write" speed was never an issue for me.)
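A toy model of the two cases, for anyone who wants to see the difference spelled out. This is only an illustration of the behaviour described above, not unRAID's code. The function names are invented, sizes are in bytes, and it ignores the minimum-free-space setting and free space on the array.

```python
GIG = 10**9


def place_new_file(declared_size, cache_free):
    """Choose a target at create time, knowing only the size declared up front."""
    if declared_size <= cache_free:
        return "cache"  # a zero-length file always "fits" here
    return "array"


def write_file(actual_size, declared_size, cache_free):
    """Follow one file from creation to completion (or failure)."""
    target = place_new_file(declared_size, cache_free)
    if target == "cache" and actual_size > cache_free:
        return f"failed on cache after {cache_free} bytes"
    return f"written to {target}"


if __name__ == "__main__":
    # 5 gig ISO, 4 gigs free on the cache: grow-as-you-go versus full size up front
    print(write_file(5 * GIG, declared_size=0, cache_free=4 * GIG))
    print(write_file(5 * GIG, declared_size=5 * GIG, cache_free=4 * GIG))
```

The first call models a copy program that starts from a zero-length file, the second one that creates the file at full size first.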

 

Joe L.

Link to comment

Thanks Joe.  What program do you use to copy?  I'm using TotalCommander.

From windows I just use file-explorer.

 

But.. I don't use a cache drive and I almost always copy to the disk share and not to the user share. I prefer to know exactly where my files end up.

 

My user-shares are read only. My disk shares are read/write, but hidden, so they do not show up in windows file-explorer unless you type its path explicitly.

 

On linux, at the command prompt, I use "cp"  but then I've been using unix/linux for nearly 30 years.

 

Link to comment
