December 29, 2015
I've no idea how to read the syslog, or what to search for. Can anyone tell me if there is anything obvious in it that might be causing the server to become unresponsive, either by the GUI or by telnet?

Here is a short history:

About 6 months ago, Unraid was becoming unresponsive every few minutes. I replaced the power supply (which was a crappy one) and things were better.

About 3 months ago, Unraid was becoming unresponsive every few minutes. Someone pointed out to me that my router was pinging and doing DHCP stuff about every 6 seconds -- a router firmware issue. I changed it and now it talks to the router each hour. They also mentioned a possible issue with the NIC, so I moved to the other NIC on the board. It was reliable for months.

Last week, I connected to a share from a computer that plays music. The player was not working great, so I mapped a network drive to the share, launched Windows Media Player, and let it build the library based on that share. Since then, Unraid becomes unresponsive every few minutes. I've restarted it 20 times or more and it works for a few minutes, then no GUI and no telnet connection.

Any help is GREATLY appreciated, because I'm in the middle of a video project for work and all my files are stored on the Unraid server and I can no longer access them.

unraid-syslog-20151229-0835.zip
December 29, 2015 Community Expert
There is nothing obvious in the syslog that I can see, but the log supplied only covers the boot sequence. Ideally you would obtain the diagnostics (via Tools->Diagnostics or by running the 'diagnostics' command from a command line session), as this provides much more information on your setup than the syslog alone, and capture it at a point where a problem is occurring. However, if you are losing all access this is obviously not possible.

Do you have a monitor/keyboard attached? Your description sounds rather like a hardware issue that is causing the server to crash. If so, something might be displayed on an attached monitor.

It could be worth booting into the memory test option and letting that run for some hours, as failing RAM can have all sorts of unpredictable side effects.
December 29, 2015 Author
I've attached the diagnostics file. After disconnecting the mapped drive, removing the link to the share, and avoiding browsing/using the files, it has kept running for hours.

I DID see, right before one crash, that Disk 3 showed that it was running hot. I've ordered a couple of spare drives in case that is the issue -- also because a couple of the drives are pretty full.

I do have a monitor/keyboard attached. The next time it goes down, I'll try the memory test.

unraid-diagnostics-20151229-1357.zip
January 4, 2016 Author
Okay, this is the follow-up...

It appeared that Disk 3 was showing some errors and was overheating regularly. I ordered some new drives and installed one over the weekend in place of Disk 3. The system rebuilt to the new drive and parity was restored. Although this MAY have been an issue, it was not THE issue that has been causing my problems...

I then navigated to a share from my PC, opened a folder, then another, then another which contains about 40 photos from Christmas. I opened a photo in the MS Photos application and began scrolling through them quickly. Before I got through them all, I lost my connection to the share, the GUI wouldn't load, and Telnet was useless.

I used the console with a local keyboard to obtain the current syslog using the tail command. It reported the following:

Jan 4 12:30:07 UNRAID ntpd[1342]: new interface(s) found: waking up resolver
Jan 4 12:31:25 UNRAID kernel: sky2 0000:02:00.0 eth0: tx timeout
Jan 4 12:31:25 UNRAID kernel: sky2 0000:02:00.0 eth0: transmit ring 1 .. 26 report=1 done=1
Jan 4 12:31:27 UNRAID ntpd[1342]: Deleting interface #20 eth0, 192.168.1.150#123, interface stats: received=0, sent=1, dropped=0, active_time=80 secs
Jan 4 12:31:27 UNRAID ntpd[1342]: 198.55.111.50 local addre 192.168.1.150 -> <null>
Jan 4 12:31:28 UNRAID kernel: sky2 0000:02:00.0 eth0: Link is up at 1000 Mbps, full duplex, flow control rx
Jan 4 12:31:30 UNRAID ntpd[1342]: Listen normally on 21 eth0 192.168.1.150:123
Jan 4 12:31:30 UNRAID ntpd[1342]: new interface(s) found: waking up resolver

Then I sent a poweroff command. It logged the following:

- eth0 tx timeout
- transmit ring 1 .. 26 report=1, done=1
- Deleting interface #21 eth0, 192.168.1.150#123, received 0, sent 2, dropped 0, active time 75secs
- 198.55.11.50 local address 192.168.1.150 -> <null>
- eth0: Link is up at 1000Mbps, full duplex, flow control rx
- Listen normally on 22 eth0 192.168.1.150:123
- new interface found - waking up resolver
- eth0 tx timeout
...and so on.

The cycle seems to repeat about every minute. This may be DHCP; if so, I may assign a static IP after all. Any other thoughts?
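For anyone wanting to check how regular the cycle is, here is a minimal Python sketch that pulls the kernel "tx timeout" lines out of a saved syslog and measures the gap between them. It assumes the line layout shown in the excerpt above. The function names and the `year` parameter are made up for illustration (syslog timestamps carry no year), and the sample lines are the two copied from the post.

```python
import re
from datetime import datetime

# Matches kernel lines shaped like the one in the post:
#   Jan 4 12:31:25 UNRAID kernel: sky2 0000:02:00.0 eth0: tx timeout
TX_TIMEOUT = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:"
    r".*\b(?P<iface>eth\d+): tx timeout"
)


def tx_timeouts(lines, year=2016):
    """Return a list of (timestamp, interface) for each 'tx timeout' line.

    The year is an assumption supplied by the caller because classic
    syslog timestamps do not include one.
    """
    events = []
    for line in lines:
        match = TX_TIMEOUT.search(line)
        if match:
            stamp = " ".join(match.group("stamp").split())
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            events.append((when, match.group("iface")))
    return events


def gaps_in_seconds(events):
    """Return the seconds elapsed between consecutive timeout events."""
    times = [when for when, _ in events]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    sample = [
        "Jan 4 12:30:07 UNRAID ntpd[1342]: new interface(s) found: waking up resolver",
        "Jan 4 12:31:25 UNRAID kernel: sky2 0000:02:00.0 eth0: tx timeout",
    ]
    for when, iface in tx_timeouts(sample):
        print(when.isoformat(), iface)
```

To run it against a real log, pass the open file instead of the sample list, for example `tx_timeouts(open("syslog"))`. Evenly spaced gaps would fit the "every minute" impression, but the spacing by itself does not say whether the NIC, the driver, or the network is at fault.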
January 4, 2016 Author
Latest diagnostics -- any suggestions?

unraid-diagnostics-20160104-1634.zip
January 4, 2016
Just out of curiosity, what version are you running, and have you upgraded recently?

I'm one of a couple of people who've posted a similar problem: we had been running version 5 with no problems whatsoever for an extended period of time, then after upgrading to 6 our servers started becoming unresponsive. In my case it is not as frequent as yours. After approximately three days of being up with no errors or problems, it just stops responding. The last time it happened I had access to the console for a short time before it completely locked up, and I noted that a process had pegged my CPU at 100% for an extended period before it went completely dark.

I know I can resolve this problem by rolling back to version 5, but I sure hate to now that I have Docker running SABnzbd and Sonarr.
January 4, 2016 Author
I'm on Ver. 6.1.6 -- upgraded several weeks ago from 6.1.?. As long as I don't access the data, mine seems to run forever. So, I suppose one fix would be to take it off the network and then it will keep my data secure.
January 4, 2016
Hi TedatTNT,

I think I read that you are using DHCP for your server? I would probably recommend a static IP address anyway. I'm not sure that this is causing the issues you describe, but at least it would eliminate that variable.
January 5, 2016 Author
Quote: "I think I read that you are using DHCP for your server? I would probably recommend a static IP address anyway."

Can you tell me why? My IP address doesn't change -- it is reserved by the router using the MAC address. Perhaps it is my networking background, but I RARELY like to use static IPs and much prefer reserving them on the router/switch upstream, in case I make network changes.

If there REALLY is a benefit to having a static IP, I'll change it, but I'd rather not change it if the only concern is that the server could be assigned a different IP -- my reservation on the router will prevent that.

Thanks for any clarification on this.

Ted
January 5, 2016
Ted, what you're doing is just fine. You've essentially got a static IP with your setup, albeit router side rather than server side. FWIW, I do exactly the same as you. I much prefer handling IP addresses all in one location at the router to changing individual machines one by one...
January 5, 2016 Community Expert
I always reserve IP by MAC on my router also and never have a problem. What does your router have for lease time?
January 5, 2016 Author
Okay, everything is up and running, the new drive is in place, and parity is again established. Memtest found no errors. It has been running overnight and this is my latest diagnostic file. I really don't understand what I'm looking at -- does anyone understand how to read these, and does it indicate any issue?

I'm still certain that I can crash Unraid by quickly scrolling through pictures in a folder from my computer.

Oh, and just to restate, by crash, what I mean is that it becomes inaccessible -- no shares on the network, no drives on the network, no GUI, and no Telnet access. I can still interact with the console, but that is all.

Ted

unraid-diagnostics-20160105-0836.zip
January 5, 2016 Community Expert
Quote: "... Oh, and just to restate, by crash, what I mean is that it becomes inaccessible -- no shares on the network, no drives on the network, no GUI, and no Telnet access. I can still interact with the console, but that is all."

Have you tried network access from a different computer when this happens?
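One way to run that check the same way from several PCs is a small Python sketch that tries a TCP connection to each service mentioned in the thread. The port numbers are assumptions based on the usual defaults (80 for the web GUI, 23 for telnet, 445 for SMB shares), and the function names are made up for illustration.

```python
import socket

# Assumed default ports for the services named in the thread;
# adjust these if the server is configured differently.
SERVICES = {"web GUI": 80, "telnet": 23, "SMB shares": 445}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host, services=SERVICES, timeout=2.0):
    """Return a dict mapping each service name to True (reachable) or False."""
    return {name: port_open(host, port, timeout)
            for name, port in services.items()}
```

Calling `check_services("192.168.1.150")` (the address that appears in the syslog excerpt) from two different machines at the moment of a failure would show whether every client loses every service at once. That result would point toward the server or its NIC and away from any single PC.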
January 5, 2016 Community Expert
I have something a bit related... Every few days, the server becomes unresponsive since I upgraded to 6.1.6. Interestingly, I can access SickBeard and Plex (running from Dockers) but can't access SAB, the Unraid web GUI, or any shares. Very frustrating. I have to power down; that's the only way to restore the connection.
January 5, 2016
To me, it looks like the NIC is crapping out and restarting. It's not a Realtek by any chance? What hardware are you running (particularly motherboard), TedatTNT?
January 5, 2016 Author
My board is the Asus P5P-DH Deluxe with dual Gigabit LAN: 2 x Marvell 88E8053 Gigabit LAN Controller, both featuring AI NET2.
January 5, 2016 Author
trurl - yes, when I lose the shares/drives and the web GUI, I lose them from each PC that can normally access the server.
January 7, 2016 Author
Okay, I have tried several tests, accessing (or attempting to access) the server from the console, telnet, and the web GUI from multiple computers. I've checked through logs, I've monitored the drives, and I have performed MANY restarts. I am thinking that my problem is between the onboard NIC (dual Gigabit Marvell 88E8053 - approved hardware) and Unraid.

I checked the BIOS. I am doing no overclocking or AI stuff. I'd originally disabled most extra features or components that I'm not using. I just set DRAM ECC to auto instead of disabled (in case it helps), but I'm guessing that since I'm not using server-class RAM, this won't do anything. I've switched to the other NIC, and I have both enabled in the BIOS (previously, only the other one was enabled).

FYI, I'm running a Core2 Quad 6600 processor and 8 GB of RAM on this build.

As of right now, the method that I'd used a dozen times to test reliability, which always ended with me losing the connection to the server, no longer triggers the failure. I can't break it. So, the only things different are the ECC change and the move to a different NIC.

Fingers crossed...
December 1, 2016
Quote: "I have something a bit related... Every few days, the server becomes unresponsive since I upgraded to 6.1.6. Interestingly, I can access SickBeard and Plex (running from Dockers) but can't access SAB, the Unraid web GUI, or any shares. Very frustrating. I have to power down; that's the only way to restore the connection."

Did this ever resolve for you? My issue is similar. I can connect to the apps that started, but my Dockers never start, and when it hangs the log just says, "ntpd[1577]: new interface(s) found: waking up resolver" and never moves past that... Everything was working fine, and then this started randomly happening!