starcat Posted January 11, 2015
Hi guys, I am seeing a strange phenomenon when writing to an array; maybe some kind soul can point me in the right direction.

rsync sometimes breaks with the following message:

rsync: writefd_unbuffered failed to write

Restarting rsync might work or not. Also, from time to time "ls -l" takes very long, then it works again. It doesn't matter whether I use rsync locally or remotely (SMB-exported disk, writing to disk shares only); the rsync behavior is the same. This happens very often (but not always) on all disks.

Writing to a file on each disk with dd is OK. smartctl -a, hdparm -I and fdisk -l are all OK. There are no error messages in the syslog about any defective disk whatsoever.

All data disks are ST4000DM000; the parity disk, however, is a Seagate Constellation (ST4000NM0033). unRaid version is 5.0.5. No cache drive is used; I am writing directly to disks in the parity-protected array.

Any ideas? I kind of suspect my parity disk... I remember that when I was first copying data to the array, I had the array enabled without a parity disk and everything was OK.

Also, the rsync copy speed varies, ranging from 20MB/s to 50MB/s with the same file (as if it depended on the location on disk). Sometimes I only get a couple of KB/s and then it breaks with that error message. When I run a parity check, or a sync after replacing a defective disk, I constantly get 100-130MB/s on the array!

I have disabled everything that can be disabled (cache_dirs, plugins, etc.). No change.

Config:
Software: unRaid Server Pro 5.0.5
Mainboard: Supermicro X7SBE with SIM1U+
Memory: 4x Kingston KVR800D2E5/2G
CPU: Intel E6550
HBAs: 2x Supermicro AOC-SAT2-MV8
Data disks: Seagate ST4000DM000
Parity disk: Seagate ST4000NM0033
Case: Norco 4220
PSU: 750HX

starcat Posted January 11, 2015
Just upgraded to 5.0.6 with no change. Attaching the smartctl -a, hdparm -I and fdisk -l output for all disks. Also, Reallocated_Sector_Ct, Current_Pending_Sector and UDMA_CRC_Error_Count are all 0 on all disks.

Attachment: test-disks.output.txt.zip

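The three SMART counters named above can be pulled out of saved "smartctl -a" output with a few lines of Python rather than read by eye for every disk. This is only a minimal sketch, assuming Python 3 is available on whichever machine holds the saved output (a stock unRAID 5 install may not include it). The function name watched_raw_values and the SAMPLE rows are made up for illustration; they are not taken from the attached file.

```python
# Attributes the post above reports as 0 on all disks.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "UDMA_CRC_Error_Count")


def watched_raw_values(smartctl_text, watched=WATCHED):
    """Return {attribute_name: raw_value} for the watched SMART attributes.

    Expects the attribute table printed by "smartctl -a", whose rows have
    ten columns: ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE,
    UPDATED, WHEN_FAILED, RAW_VALUE.
    """
    values = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the header row does not.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in watched:
            values[fields[1]] = int(fields[9])
    return values


# Illustrative rows only (hypothetical values, not the poster's real output).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

print(watched_raw_values(SAMPLE))
```

Any non-zero value in the result would be worth a closer look at that disk or its cabling; all zeros, as reported here, points away from the usual disk-level suspects.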
dgaschk Posted January 11, 2015
See check disk file systems in my sig.

starcat Posted January 12, 2015
Thanks, checking the file systems now. It takes quite long for 4TB drives. Will report soon. Thanks again.

starcat Posted January 20, 2015
It took me very long, but I just finished checking all drives with "reiserfsck --check /dev/mdX" and all filesystems are fine with no corruption! Parity verify also ends up with 0 corrections! Any other ideas what I might try? Is it possible that the Seagate Constellation (ST4000NM0033) parity disk doesn't work well with the Seagate ST4000DM000 data drives?

dgaschk Posted January 21, 2015
Attach a syslog after the problem occurs.

starcat Posted January 24, 2015
Attaching the syslog below. Writes are very slow, in the kB/s range. Sometimes ls -l takes a long time, then it works again. Running a parity check does not show any errors. The filesystem check doesn't report any errors either.

Edit: Meanwhile, parity checking runs at 90-120MB/s...

Attachment: syslog.latest.gz

dgaschk Posted January 25, 2015
starcat wrote: "Parity verify also ends up with 0 corrections!"
How did you check parity? The parity disk has no file system, so a file system check on it should have given an error. Did you mean the cache drive?

dgaschk Posted January 25, 2015
How are you shutting down? An automatic parity check starts after an unclean shutdown; this is indicated in the syslog. A running parity check will slow performance.

starcat Posted January 25, 2015
dgaschk wrote: "How did you check parity? It has no file system and should have given an error. Did you mean cache drive?"
I check parity via the unRaid Main menu in the web console. I select the parity check but uncheck "correct parity errors". It runs through showing 0 parity errors. I do not run reiserfsck on the parity drive. I don't have a cache drive installed.

starcat Posted January 25, 2015
dgaschk wrote: "How are you shutting down? An automatic parity check starts after an unclean shutdown. This is indicated in the syslog. Parity checks will slow performance."
At that time I was rebooting the server with "/sbin/reboot" after reverting to the plain stock go file, and I then canceled the parity check.

For a while performance might be OK. But then it slows down to the kB/s range, and at some point copying stops altogether. At that moment ls -l takes forever, and rsync (remote or local, it doesn't matter) stops with the unbuffered error. There is no entry in the syslog when this happens. After some time the server eventually recovers, and ls -l and subsequent calls to rsync work for a while. Then it stops again! CPU load is minimal.

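Since the syslog shows nothing when the stall happens, timestamped measurements of write speed and directory-listing time may help pin down when the slowdowns start and how long they last. The following is a rough sketch only, assuming Python 3 is available on the machine doing the writing (it could be a client writing to the SMB-exported disk share). The function names, the probe file name and the /mnt/disk1 path in the usage comment are illustrative assumptions, not anything from the thread.

```python
import os
import time


def probe_once(directory, size_bytes=1024 * 1024):
    """Time one synced write and one directory listing; return (write_s, list_s)."""
    path = os.path.join(directory, ".latency_probe.tmp")  # hypothetical file name
    payload = b"\0" * size_bytes

    start = time.monotonic()
    with open(path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())  # force the data out to the disk
    write_s = time.monotonic() - start

    start = time.monotonic()
    os.listdir(directory)  # cheaper than "ls -l", but hits the same directory
    list_s = time.monotonic() - start

    os.remove(path)
    return write_s, list_s


def format_sample(write_s, list_s, size_bytes, now=None):
    """Build one timestamped log line from a measurement."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    mb_per_s = (size_bytes / 1e6) / write_s if write_s > 0 else float("inf")
    return f"{stamp} write={mb_per_s:.1f}MB/s list={list_s * 1000:.1f}ms"


def run(directory, samples=10, interval_s=5.0, size_bytes=1024 * 1024):
    """Take several measurements, print each one, and return the log lines."""
    lines = []
    for index in range(samples):
        write_s, list_s = probe_once(directory, size_bytes)
        line = format_sample(write_s, list_s, size_bytes)
        print(line, flush=True)
        lines.append(line)
        if index + 1 < samples:
            time.sleep(interval_s)
    return lines


# Example call (path is an assumption; adjust to the disk share under test):
# run("/mnt/disk1", samples=720, interval_s=5.0)
```

Note that a probe that hangs inside a stall will only report once the stall ends, so long gaps between consecutive timestamps are themselves the signal to compare against the syslog.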
dgaschk Posted January 26, 2015
Shutting down with reboot is like pulling the power. It is very bad for the filesystems, which will have to replay journal transactions when mounted. It can also damage the filesystems' metadata. Install the powerdown add-on or use the GUI. You can also use wget to call the shutdown URL, but this only works as long as emhttp is running.

starcat Posted July 31, 2016
Still experiencing the same problem. Filesystems are clean, parity is clean... from time to time I just get the "rsync: writefd_unbuffered failed to write" error, and nothing I have tried helps. Some time later the copy might continue... very strange. Any help is highly appreciated.

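Because the copy often continues when rsync is restarted later, a retry wrapper can at least keep a long transfer moving until the cause is found. This is a workaround sketch, not a fix, and it assumes Python 3 on the machine that launches rsync. The helper name run_with_retries, the retry counts and both paths are hypothetical; -a, --partial and --timeout are standard rsync options.

```python
import subprocess
import time


def run_with_retries(command, attempts=5, delay_s=30.0):
    """Run command until it exits with 0 or the attempts run out.

    Returns the list of exit codes, one per attempt, so the caller can
    see how often the command failed before it succeeded (or gave up).
    """
    codes = []
    for attempt in range(1, attempts + 1):
        code = subprocess.call(command)
        codes.append(code)
        if code == 0:
            break
        if attempt < attempts:
            time.sleep(delay_s)  # give the server time to recover
    return codes


# Hypothetical source and destination paths; replace with real ones.
# --partial keeps partly transferred files so a retry can resume them,
# --timeout makes rsync give up after 120 s without I/O instead of hanging.
RSYNC_COMMAND = [
    "rsync", "-a", "--partial", "--timeout=120",
    "/path/to/source/", "/mnt/disk1/backup/",
]

# Example call (commented out so nothing is copied by accident):
# print(run_with_retries(RSYNC_COMMAND, attempts=10, delay_s=60.0))
```

The returned exit codes, together with the time of each failure, are also useful evidence to post alongside a syslog.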
RobJ Posted August 1, 2016
You are having an issue with a very old version of unRAID, so the best advice is to upgrade to unRAID v6, currently version 6.1.9. It's free if you have a paid license. There have been thousands of bugs fixed in Linux, and probably some in rsync, so it's very likely the problem has long been fixed. There is no further development for v5.

starcat Posted August 3, 2016
Thanks Rob, I know. I have already been reading your wiki on upgrading to v6 and will do so as soon as I find some spare time. I have a Pro license, so this is OK. Thanks for your feedback!