Help: Write issues to array (writefd_unbuffered failed to write)

starcat · January 11, 2015

Hi guys,

I am seeing strange behavior when writing to the array. Maybe some kind soul can point me in the right direction.

rsync sometimes breaks with the following message:

rsync: writefd_unbuffered failed to write

Restarting rsync might or might not work.

Also, from time to time "ls -l" takes a very long time. Then it works again.

It doesn't matter whether I run rsync locally or remotely (SMB-exported disk, writing to disk shares only); the behavior is the same.

This happens very often (but not always) and on all disks.

Writing using dd to a file on each disk is OK.

smartctl -a, hdparm -I, fdisk -l, are all OK.

There are no error messages in the syslog about any defective disk whatsoever.

All data disks are ST4000DM000; the parity disk, however, is a Seagate Constellation (ST4000NM0033).

unRaid version is 5.0.5

No cache drive used, writing directly to disks in the parity protected array.

Any ideas?

I kind of suspect my parity disk... I remember that when I first copied data to the array, I ran it without a parity disk and everything was OK.

Also, the rsync copy speed varies, ranging from 20MB/s to 50MB/s with the same file (as if it depended on the location on disk). Sometimes I only get a couple of KB/s and then it breaks with that error message.

When I do parity checks or sync after replacing a defective disk, I constantly get 100-130MB/s on the array!

I have disabled everything that can be disabled (cache_dirs, plugins, etc.). No change.

config:

Software: unRaid Server Pro 5.0.5

Mainboard: Supermicro X7SBE with SIM1U+

Memory: 4x Kingston KVR800D2E5/2G

CPU: Intel E6550

HBAs: 2x Supermicro AOC-SAT2-MV8

Data disks: Seagate ST4000DM000

Parity disk: Seagate ST4000NM0033

Case: Norco 4220

PSU: 750HX

starcat · January 11, 2015

Just upgraded to 5.0.6 with no change.

Attaching smartctl -a, hdparm -I, fdisk -l output on all disks.

Also, Reallocated_Sector_Ct, Current_Pending_Sector, UDMA_CRC_Error_Count are all 0 on all disks.
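For anyone wanting to repeat this check, here is a minimal sketch of how the three counters could be pulled out of `smartctl -A` output. The function name `smart_counters` is hypothetical, and it assumes the usual ten-column attribute table, where the raw value is the tenth field.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print the raw
# value (10th column) of the three counters mentioned above.
smart_counters() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "UDMA_CRC_Error_Count" { print $2 "=" $10 }'
}
```

It could be run per disk as `smartctl -A /dev/sdb | smart_counters`, where `/dev/sdb` is only a placeholder device name. A non-zero value on any line would be worth a closer look.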

test-disks.output.txt.zip

dgaschk · January 11, 2015

See check disk file systems in my sig.

starcat · January 12, 2015

Thanks, checking the file systems now. Takes quite long for 4TB drives.

Will report soon. Thanks again.

starcat · January 20, 2015

Took me very long, but I just finished checking all drives with "reiserfsck --check /dev/mdX" and all filesystems are fine with no corruption!

Parity verify also ends up with 0 corrections!

Any other ideas what I might try??

Is it possible that the Seagate Constellation (ST4000NM0033) parity doesn't work well with the Seagate ST4000DM000 disk drives?

dgaschk · January 21, 2015

Attach a syslog after the problem occurs.

starcat · January 24, 2015

Attaching syslog below.

Writes are very slow, in the kB/s range. Sometimes ls -l takes a long time, then it works again.

Running a parity check does not show any errors.

The filesystem check doesn't report any errors.

Edit:

Meanwhile, the parity check runs at 90-120MB/s...

syslog.latest.gz

dgaschk · January 25, 2015

[Quoting starcat, January 20, 2015:]

Took me very long, but I just finished checking all drives with "reiserfsck --check /dev/mdX" and all filesystems are fine with no corruption!

Parity verify also ends up with 0 corrections!

[End of quote]

How did you check parity? It has no file system and should have given an error. Did you mean cache drive?

dgaschk · January 25, 2015

How are you shutting down? An automatic parity check starts after an unclean shutdown. This is indicated in the syslog. Parity checks will slow performance.

starcat · January 25, 2015

[Quoting dgaschk, January 25, 2015:]

How did you check parity? It has no file system and should have given an error. Did you mean cache drive?

[End of quote]

I check parity via the unRaid Main menu on the web console. I select check parity but uncheck "correct parity errors". It runs through showing 0 parity errors. I do not run reiserfsck on the parity drive. I don't have a cache drive installed.

starcat · January 25, 2015

[Quoting dgaschk, January 25, 2015:]

How are you shutting down? An automatic parity check starts after an unclean shutdown. This is indicated in the syslog. Parity checks will slow performance.

[End of quote]

At that time I rebooted the server with "/sbin/reboot" after reverting to the plain stock go file.

I then canceled the parity check.

For a while, performance might be OK. Then it slows down to the kB/s range, and at some point copying stops altogether. At that moment ls -l takes forever, and rsync (remote or local, it doesn't matter) stops with the writefd_unbuffered error. There is no entry in the syslog when this happens. After some time the server eventually recovers, and ls -l and subsequent calls to rsync work again for a while. Then it stops again!! CPU load is minimal.
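Since nothing shows up in the syslog, one way to pin down when the stalls happen is to time a directory listing at regular intervals and log the slow ones. The following is only a rough sketch: the function name `log_slow_ls`, the threshold, and the paths in the usage note are assumptions, and the timing has whole-second resolution.

```shell
# Hypothetical helper: time one `ls -l` of directory $1 and print a
# timestamped line if it took at least $2 seconds.
log_slow_ls() {
    local dir=$1 limit=$2 start end elapsed
    start=$(date +%s)
    ls -l "$dir" > /dev/null 2>&1
    end=$(date +%s)
    elapsed=$((end - start))
    if [ "$elapsed" -ge "$limit" ]; then
        echo "$(date '+%F %T') ls -l $dir took ${elapsed}s"
    fi
}
```

A loop such as `while sleep 10; do log_slow_ls /mnt/disk1 5; done >> /boot/ls-stalls.log` would then record each stall with a timestamp that can be compared against the rsync failures. Both the disk path and the log file location are placeholders.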

dgaschk · January 26, 2015

Shutting down with reboot is like pulling the power. It is very bad for the filesystems, and they will have to replay journal transactions when mounted. It can also damage the filesystem metadata. Install the powerdown add-on or use the GUI. You can also use wget to call the shutdown URL, but this only works as long as emhttp is running.

starcat · July 31, 2016

Still experiencing the same problem. Filesystems are clean, parity is clean... but from time to time I still get the "rsync: writefd_unbuffered failed to write" error. Some time later, the copy might continue... very strange. Any help is highly appreciated.
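As a stopgap while the cause is unknown, the copy could be wrapped in a retry loop so that it picks up again once the server recovers. This is a workaround sketch, not a fix: the function name `retry`, the attempt count, and the pause are assumptions.

```shell
# Hypothetical helper: run a command up to $1 times, pausing $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    local max=$1 pause=$2 attempt=1
    shift 2
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "$pause"
        fi
    done
    return 1
}
```

For example, `retry 5 60 rsync -a --partial --timeout=300 /source/dir/ /mnt/disk1/dest/` would try up to five times with a minute between attempts. The paths are placeholders; `--partial` keeps partially transferred files so a later attempt can reuse them, and `--timeout` makes rsync give up on a stalled transfer instead of hanging.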

RobJ · August 1, 2016

You are having an issue with a very old version of unRAID, so the best advice is to upgrade to unRAID v6, currently version 6.1.9. It's free if you have paid licenses. Thousands of bugs have been fixed in Linux, and probably some in rsync, so it's very likely the problem has long been fixed. There is no further development for v5.

starcat · August 3, 2016

Thanks Rob, I know. I was already reading your Wiki on upgrading to v6 and will do as soon as I find some spare time. I have a pro license, so this is ok. Thanks for your feedback!
