bardsleyb Posted June 22, 2015
Ever since upgrading from unRAID 5 to unRAID 6 I have been having a problem: the web interface hangs and stops responding, VMs and all Docker apps go offline, and shares become inaccessible from every other computer on the network. I am able to telnet in, but that is about it. Each time I have had to power off the server manually and let it run a parity check. This has happened about once every couple of days over the several weeks since the upgrade. I am just now getting around to attaching the log and asking somebody about it. Maybe I missed a config file or put something in that I shouldn't have? I followed the step-by-step upgrade instructions in the Wiki, but I could have missed something.
Attached: syslog.zip
jonp Posted June 22, 2015
Please attach the diagnostics file. You can get it from the Tools -> Diagnostics page.
bardsleyb Posted June 22, 2015
Here you go. Thank you in advance.
Attached: tower-diagnostics-20150622-1114.zip
bardsleyb Posted June 22, 2015
It may be worth mentioning that I had already rebooted my server after this morning's failure, so that file will reflect that. If I need to capture the diagnostics file BEFORE I reboot the server next time, let me know how to do that and I certainly will. It will need to be command-line instructions though, since my web interface will not be working.
itimpi Posted June 22, 2015
The same file can be created by running the 'diagnostics' command from a console/telnet session. It will be placed in the 'logs' folder on the flash drive.
bardsleyb Posted June 23, 2015
I guess my search is broken, because I have looked for how to use the "diagnostics" command from the CLI and could not find the right syntax. Does anyone have it, in case this happens to me again? I would like to be able to dump that diagnostics file onto my flash drive before I power down.
itimpi Posted June 23, 2015
There is no syntax. Simply type that command and press Enter, then when it completes look in the 'logs' folder on the flash drive for the zip file that has been created.
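The steps described in this reply could be sketched as the short shell snippet below. It is only an illustration under stated assumptions: it assumes the flash drive is mounted at /boot (so the 'logs' folder would be /boot/logs), the file name pattern is taken from the attachments posted in this thread, and `newest_diagnostics` is a hypothetical helper name, not part of unRAID.

```shell
#!/bin/sh
# Sketch: run diagnostics from a telnet/console session, then find the zip.
# Assumption: the flash drive is mounted at /boot, so 'logs' is /boot/logs.

# newest_diagnostics DIR (hypothetical helper): print the most recently
# modified *-diagnostics-*.zip file in DIR, or nothing if there is none.
newest_diagnostics() {
    ls -t "$1"/*-diagnostics-*.zip 2>/dev/null | head -n 1
}

# Only run the unRAID-specific command where it actually exists.
if command -v diagnostics >/dev/null 2>&1; then
    diagnostics                      # takes no arguments
    newest_diagnostics /boot/logs    # show which zip was just written
fi
```

Running the helper after `diagnostics` finishes prints the path of the newest zip, which is the file to copy off the flash drive and attach.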
bardsleyb Posted June 28, 2015
Happened again. Log files attached. I tried the diagnostics command via a telnet session and got this:
root@Tower:~# diagnostics
cp: cannot stat ‘/boot/config/*.conf’: No such file or directory
Attached: syslog.txt
bardsleyb Posted June 28, 2015
I really need that diagnostics file, because I am betting that whatever tells the story of what is going on is not in these logs. I don't see anything in this small log file that shows anything wrong at all. The last thing I did was attempt to delete files using Sonar in a Docker app. Then the GUI locked up and my Plex went offline. I waited for about an hour before forcing a reboot. The server had run for about four and a half days without incident, but I don't think my log file covers that much time. Is there a way to increase the log file size limit so that it records a longer period?
itimpi Posted June 28, 2015
As far as I know, that 'cp: cannot stat' message does not stop the diagnostics zip file from being created. Have you tried looking in the 'logs' folder on the flash drive to see if it is there? One change I would recommend for the diagnostics script in a future unRAID release is an information message saying that the zip file has been created and giving its exact name (including location). This would help those who run it from the command line.
bonienl Posted June 28, 2015
Good idea, I will add that for the next version. Thanks.
Note: the error message from the copy command should not stop diagnostics from creating the zip file. This message will be suppressed in the next version.
bardsleyb Posted June 28, 2015
It did copy; you were correct. Thank you. I have included that file in this post.
Attached: tower-diagnostics-20150627-2242.zip
bardsleyb Posted July 6, 2015
Happened again last night. Here are more logs.
Attached: tower-diagnostics-20150705-2031.zip
spectorus Posted July 7, 2015
This seems to be happening to me too. I found this thread while searching for an answer. Next time it occurs I will add logs as well.
itimpi Posted July 7, 2015
Both of the logs that were posted are full of I/O errors, which I expect are the root cause of your issues. I think they are on device sdd, which appears to be in BTRFS format. Is that your cache drive?
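For anyone wanting to check their own logs for the same symptom, a rough sketch of counting I/O errors per device follows. It assumes the kernel writes lines containing "I/O error, dev <name>" (the exact wording in the posted logs is not shown in this thread and may differ), and `count_io_errors` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count kernel I/O error lines per device in a syslog file.
# Assumption: error lines contain the text "I/O error, dev <name>".

# count_io_errors FILE (hypothetical helper): print "<count> <device>" for
# each device that appears in an I/O error line, most frequent first.
count_io_errors() {
    grep -o 'I/O error, dev [a-z0-9]*' "$1" \
        | awk '{print $NF}' \
        | sort | uniq -c | sort -rn
}

# Example: scan the syslog.txt extracted from a diagnostics zip, if present.
if [ -f syslog.txt ]; then
    count_io_errors syslog.txt
fi
```

A device that dominates the output is the first one to inspect with SMART tools.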
bardsleyb Posted July 7, 2015
Yes, it is. It has an offline uncorrectable value of 3. It passes the SMART test, but it has pre-fail values on it too. I had plans to replace it soon. Maybe I should do that sooner rather than later?
itimpi Posted July 7, 2015
The pending sectors mean that at best you will get inconsistent results any time the system tries to read those sectors, and if they are in an important area of the disk they could cause all sorts of problems. I think you need to get that issue resolved and then see whether the UI still hangs. Errors of that sort are sometimes corrected by running a pre-clear cycle against the disk. You could also try the manufacturer's diagnostic software to see if it can clear the errors.
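To see whether the pending or uncorrectable sector counts change after a fix is attempted, the SMART attributes can be read from the command line. The sketch below assumes `smartctl` (from smartmontools) is available, that the drive is /dev/sdd as in this thread, and that it reports the usual `Current_Pending_Sector` and `Offline_Uncorrectable` attribute names with the raw value in the tenth column; `pending_sectors` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the raw pending/uncorrectable sector counts for one drive.
# Assumptions: smartctl is installed and the drive uses the common
# attribute names Current_Pending_Sector and Offline_Uncorrectable.

# pending_sectors (hypothetical helper): read `smartctl -A` output on stdin
# and print "<attribute> <raw value>" for the two attributes of interest.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" {print $2, $10}'
}

# /dev/sdd is the device named in this thread; adjust for your system.
if command -v smartctl >/dev/null 2>&1 && [ -b /dev/sdd ]; then
    smartctl -A /dev/sdd | pending_sectors
fi
```

Non-zero raw values that do not drop back to zero after a full write of the disk suggest the drive should be replaced rather than reused.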
dgaschk Posted July 7, 2015
http://lime-technology.com/wiki/index.php/Troubleshooting#Resolving_a_Pending_Sector
Econaut Posted July 11, 2022
Is there any way to recover from an unresponsive UI via the terminal? I can SSH in, but the web UI is totally unresponsive. All I did was hit Apply after changing one option on a Docker container.