lionelhutz Posted February 18, 2017
Well, there's really no mention of ANYTHING to do with this from LT, and apparently an all-XFS system does this too, so I'm not convinced it's an RFS problem. It's always been best to install Windows on a clean partition, so that hardly qualifies as a comparison to having to wipe out many drives' worth of data...
lordsiris Posted February 19, 2017
I seem to have a similar problem. Came home yesterday to a completely unresponsive server. Had to reset, as nothing else worked. Parity check came back clean. Just a while ago it was unresponsive again: load of 875/875/875, shfs process at 100%. No plugins, no docker. Don't have logs, as my wife was complaining because Kodi couldn't play files, so I reset again. I did the latest update on Thursday, I believe.
John_M Posted February 19, 2017
Have you considered, either as an aid to troubleshooting or as a temporary workaround, disabling user shares?
thomast_88 Posted February 21, 2017 (edited)
I've had this problem since 6.3.0-rc5. It's easy for me to replicate: if I start writing new files to the cache array (2x 250 GB EVOs in BTRFS RAID 1), server load keeps growing until 50-60, then eventually VMs start dying, then Docker apps. If I stop writing to the cache array, the load goes back down to normal (< 1). I noticed the problem when using the FileBot Docker container to rename (move) files from the cache array to the cache array. The first 3-4 files are blazing fast, and then the load just keeps growing. Should I provide some logs while I'm doing the renaming, to further investigate this?
lionelhutz Posted February 21, 2017
11 hours ago, thomast_88 said: I've had this problem since 6.3.0-rc5. It's easy for me to replicate [...]
It's not the same problem, but it could be related. This problem causes a CPU core to be pegged at 100% continuously, and the web GUI, along with other stuff, quits responding. It also never recovers.
JustinAiken Posted March 9, 2017
```
  PID USER  PR NI    VIRT   RES SHR S  %CPU %MEM     TIME+ COMMAND
14102 root  20  0 1259740 48048 876 S 200.0  0.3 605:19.65 shfs
```
- I do have cache_dirs, but not enabled for user shares
- Mostly reiser drives, but a few XFS
- Too frozen to get diagnostics or anything
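For the "too frozen to get diagnostics" case, one workaround is a periodic snapshot script that appends the load and the top CPU consumers to a log, so there is something to look at after a forced reboot. A rough sketch, not an official unRAID tool; the log path and the field list are illustrative (on a real server, writing under /boot/logs/ would let the log survive a hard reset):

```shell
#!/bin/sh
# Illustrative snapshot logger -- run it from cron every few minutes.
LOG=/tmp/shfs-watch.log

{
  date
  uptime                                        # 1/5/15-minute load averages
  ps -eo pid,pcpu,comm --sort=-pcpu | head -5   # top CPU hogs (shfs, presumably)
  echo "---"
} >> "$LOG"
```

When the box locks up, the tail of the log shows whether shfs was already climbing before everything stopped responding.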
ZipsServer Posted March 10, 2017
Same here. Web GUI unresponsive, shfs at 100%; kill/reboot/shutdown not working.
berizzle Posted March 22, 2017
Has anyone found a fix for machines in this state? One of my unRAID machines is having the same issue.
JorgeB Posted March 22, 2017
berizzle said: Has anyone found a fix for machines in this state? One of my unRAID machines is having the same issue.
Any reiserfs disks?
grither Posted March 22, 2017 (edited)
I was just about to post a separate topic when I found this. Very similar sounding: the server locks up every two days or so, Dockers go down, and I can't access the web GUI. I can log in with PuTTY, but can't seem to issue any commands. I've had to pull power, which is obviously not good, but nothing else works. Can I roll back to an older version of unRAID? Any issues doing that?
EDIT: Out of interest, I have two machines. One seems unaffected; however, the one I'm referring to has had this issue for a couple of weeks.
ixnu Posted March 23, 2017
I had this EXACT issue and failed to find the root cause after 30 hours of troubleshooting. My problem pointed to a software issue, since it happened to me on completely disparate hardware using the same disks. I built a new machine and mounted the RFS disks on SnapRAID for a stable mount. It wasn't a big deal, since I needed to replace my ancient backup server. The problem vanished after I migrated to new disks (including cache) formatted as XFS. I emailed Tom with the suggestion that RFS be completely removed from support, but he is convinced that this is too extreme.
lionelhutz Posted March 25, 2017 (edited)
On 3/23/2017 at 11:11 AM, ixnu said: I emailed Tom with the suggestion that RFS be completely removed from support, but he is convinced that this is too extreme.
It doesn't seem to be only RFS systems that have this issue, which points to something else causing it...
On 2/16/2017 at 9:29 PM, the_larizzo said: This is happening to me also, but all my filesystems are XFS. My server will run for about 2 days, then the load goes through the roof and I can no longer run any commands or shut down. I have to hold the power button to restart.
the_larizzo Posted March 25, 2017
I ended up having to fall back to 6.2.4. The server would lock up every 2 days, XFS only.
ixnu Posted March 25, 2017
I don't lurk much, but the first course of action always seems to be to move off RFS. It's obviously a difficult problem, but it seems to be far more common on RFS, doesn't it?
JorgeB Posted March 25, 2017
8 minutes ago, ixnu said: [...] it seems to be far more common on RFS, doesn't it?
Agreed, that seems to be the most common cause. Also, if the user has at least one XFS disk or a non-reiser cache disk, it's easy to confirm whether reiser is the cause by limiting all writes to the non-reiser disks and testing for a few days or weeks.
grither Posted March 25, 2017
56 minutes ago, the_larizzo said: I ended up having to fall back to 6.2.4. The server would lock up every 2 days, XFS only.
Did this help? Please let us know. Also, did you have to rebuild all your Dockers, or did they survive the rollback?
berizzle Posted March 27, 2017
On 3/22/2017 at 3:15 AM, johnnie.black said: Any reiserfs disks?
Yes, of course. I've been running these machines for years now, maybe 5. It just happened again, and after a day of it not "coming back" I had to kill the machine.
JorgeB Posted March 27, 2017
2 hours ago, berizzle said: Yes, of course. I've been running these machines for years now, maybe 5. [...]
Reiserfs disks seem to be the #1 cause of this issue. Convert one of your disks to XFS and limit all writes to that disk for a few days/weeks by changing your share(s)' included disks, then see if the crashing stops; if it does, convert the remaining disks.
PS: IMO you should convert even if this isn't the source of the problem; there have been multiple issues with reiser lately, and it has terrible performance in certain situations.
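For anyone unsure which array disks are still on reiser, the mount table already has the answer. A quick check, assuming the standard unRAID /mnt/diskN layout (field 2 of /proc/mounts is the mount point, field 3 the filesystem type):

```shell
# Print each array disk's mount point and its filesystem type.
awk '$2 ~ /^\/mnt\/disk/ { print $2, $3 }' /proc/mounts
```

Any line ending in `reiserfs` is a candidate for conversion; `xfs` disks are the safe targets for writes while testing.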
berizzle Posted March 27, 2017
1 minute ago, johnnie.black said: Reiserfs disks seem to be the #1 cause of this issue. [...]
I have 23 drives: 21 are ReiserFS (42 TB) and 2 are XFS (6 TB), with 9 TB free across all the drives. Is there a process that makes sense for converting these disks?
JorgeB Posted March 27, 2017
7 minutes ago, berizzle said: I have 23 drives: 21 are ReiserFS (42 TB) and 2 are XFS (6 TB) [...]
Since you already have 2 XFS disks, you can test before doing the conversion and confirm whether it will really help by limiting all your writes to those disks. Go to your share(s) and set included disks to only those 2 (all share data on the other disks will still be accessible, but all new writes will go to the XFS disks), then test for a few days/weeks. To convert, see this thread:
MisterLas Posted March 28, 2017
I've been experiencing this since 6.3.2 as well, with all XFS disks, and I'm seeing more and more of these threads pop up. I tried downgrading to 6.3.0 and made it past the 2-day mark at which 6.3.2 would die, but at 4 days it died during the night as well. When I return from work, I'll be downgrading to 6.2.4. While I agree that XFS > reiser, I feel there is something more at play here.
lionelhutz Posted March 28, 2017
I'm curious how many people have converted all their drives to XFS and eliminated the problem. To me, there are far more systems with RFS drives out there, so that could explain why more systems with RFS drives have the issue.
Clink Posted April 1, 2017
I had this problem on a mixed ReiserFS/XFS system; I converted all drives to XFS and have had no shfs lock-ups since then (a couple of weeks now). Of course, it could have been some weird file/directory structure inconsistency that went away when the files and directories were newly created during copying.
DavejaVu Posted April 1, 2017
I finished converting 14 disks from Reiser to XFS about 5 days ago, and things seem stable since. I realize this isn't that interesting a data point because it hasn't been long, but that's the longest uptime I've had since upgrading to 6.3.x. And, agreed, I did everything via rsync, so it's entirely possible that the act of recreating everything in a new directory structure cleaned something up. Still, ReiserFS seems to be a quasi-supported relic at this point, so moving off of it seems advisable. While the rsync method works well, it's a bit tedious...would be nice to have a more automated method for those of us who have been upgrading our systems over the years.