October 29, 2008

OK, this problem has bugged me for quite some time, and I hope someone has some suggestions. Thanks in advance.

Some background information first.

(1) My ADSL router is at 192.168.0.2. It serves as the DHCP server as well as the gateway for my home network.
(2) Through the router's web interface I can see how many devices have connected to the home network, along with the IP address of each device.
(3) My unRAID server uses a static IP of 192.168.0.100, netmask 255.255.255.0, gateway 192.168.0.2.
(4) I am using a SanDisk Cruzer Micro 4GB, which I have double-checked has the correct label "UNRAID" and only one partition.

The problem is intermittent; it does NOT happen every time. Sometimes when I power on the unRAID server, it seems unable to join my home network: the router's web interface does not show the server as online, and I cannot ping it at 192.168.0.100 from a Windows cmd window either. If I power-cycle it, the server comes up without any problem, but a parity check then kicks in, apparently because the power cycle caused an unclean shutdown.

Any idea why this unRAID server "sometimes" cannot join the network but is fine after the next power cycle?
October 29, 2008

It sounds like something may be happening (intermittently) while unRAID is booting, so that it never reaches the commands that connect to the network. You need to go to the console to see. You MAY still be able to log in and capture a syslog to the USB stick (see below). Or the system may be hung. If that is the case, take a picture of the screen and post it. If you are running headless, you will need to install a video card and attach a monitor.

To get a syslog when the network isn't working, go directly to the unRAID server, log in, and enter the command

    cp /var/log/syslog /boot/syslog.txt

The file will then be on your USB stick. You can find more information by clicking on the "Troubleshooting" link in my sig.
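If a console is available while the network is down, the same idea can be extended to save a few network snapshots next to the syslog. This is only a sketch: capture_diag and the netdiag.txt file name are made up for illustration, and it assumes the flash drive is mounted at /boot as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: copy the syslog and some network state to a directory.
# Defaults to /boot, the flash drive mount point on unRAID.
capture_diag() {
    dest="${1:-/boot}"
    out="$dest/netdiag.txt"

    # Same copy as the manual command, skipped if the syslog is not readable.
    [ -r /var/log/syslog ] && cp /var/log/syslog "$dest/syslog.txt"

    {
        echo "=== captured: $(date) ==="
        for cmd in "ifconfig -a" "route -n" "ethtool eth0" "lsmod"; do
            echo "--- $cmd ---"
            # Keep going even if a tool is missing or fails.
            $cmd 2>&1 || echo "(command failed or not available)"
        done
    } > "$out"

    echo "$out"   # print where the snapshot was written
}

# Usage on the server console (not run automatically here):
#   capture_diag /boot
```

Running it once while the network is broken and once after a good boot gives two files that can be compared side by side.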
October 29, 2008

You should try installing the power button shutdown script found in the "Power Button Clean Shutdown" post in the Feature Requests section. Then, when you press the power button, the server will shut down cleanly and also save the log for you. Even if you never fix the underlying problem, this eliminates the unwanted parity checks.

Is the network connection set up before or after the array is started? I thought starting the array was the last task, and if a parity check is running, the array evidently did start. However, the server could still have had a problem with the network and given up on it.

Peter
October 29, 2008

Good point, the array is coming online (as evidenced by the forced parity check). This happens after the network setup. The syslog will show more. It could be as simple as a bad network cable.
October 30, 2008 (Author)

Thanks for all the replies. Unfortunately I will not have any console or syslog output until I get a KVM switch so that I can share my monitor with the unRAID server. I will post new information when the problem happens again; as I said, it is intermittent.

Meanwhile, I read through the thread about the powerdown script, but I am confused.

(1) How do I install this script correctly? Do I follow what the wiki page says, or do I download the package and install it with installpkg?
(2) After installation, how do I verify that the script and the necessary trigger have been installed correctly?

Thanks
October 30, 2008

The scripts are installed with installpkg.

Not sure which wiki page you are referring to; please elaborate. The pages at http://code.google.com/p/unraid-powercontrol/ are accurate and up to date.

To test it, I would suggest the following. First stop the array, either from the web interface or by running

    /etc/rc.d/rc.unRAID stop

This will not power down the server; it only stops the array. With the array stopped, you can test whether the system handles the power button gracefully. If you watch the console, you should see it run the /etc/rc.d/rc.unRAID script, log some information, and save the syslog. If this works, you can try it again with the array started to double-check.
October 30, 2008

I believe you download the file from http://code.google.com/p/unraid-powercontrol/, copy it to the root of the flash drive, and put

    CTRLALTDEL=yes installpkg /boot/powerdown-1.00-noarch-unRAID.tgz

at the end of the go script. I hope that's correct. WeeboTech or Joe L. would be the best ones to confirm this.

Peter
October 30, 2008

Those instructions will work; however, we recommend that you make a directory named packages on your /boot (flash) drive. Use it to store all of your downloaded Slackware packages, and use the following syntax:

    CTRLALTDEL=yes installpkg /boot/packages/powerdown-X.YY-noarch-unRAID.tgz

where X.YY is the version number, which I believe is 1.02. So:

    CTRLALTDEL=yes installpkg /boot/packages/powerdown-1.02-noarch-unRAID.tgz
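One way to append that line to the go script without duplicating it on repeated runs is sketched below. It assumes the go script lives at /boot/config/go, which is the usual unRAID location but is not stated in this thread; add_go_line is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (so running it twice does not duplicate it).
add_go_line() {
    go_file="$1"
    line="$2"
    # -x: match whole line, -F: fixed string, -q: quiet
    grep -qxF -- "$line" "$go_file" 2>/dev/null \
        || printf '%s\n' "$line" >> "$go_file"
}

# Usage on the server (path is an assumption, version as given above):
#   add_go_line /boot/config/go \
#     'CTRLALTDEL=yes installpkg /boot/packages/powerdown-1.02-noarch-unRAID.tgz'
```

Because unRAID runs from RAM, the go script is what re-installs the package on every boot, so the line only has to be added once.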
October 30, 2008 (Author)

Quote: "Not sure which wiki page you are referring to; please elaborate."

This is the page I was referring to: http://lime-technology.com/wiki/index.php?title=Powerdown_script
October 30, 2008 (Author)

Quote: "I believe you download the file from http://code.google.com/p/unraid-powercontrol/, copy it to the root of the flash drive, and put CTRLALTDEL=yes installpkg /boot/powerdown-1.00-noarch-unRAID.tgz at the end of the go script."

Ha! Yes, that is where I downloaded the package from, and the part I highlighted in red is what I missed. I thought installpkg would do the installation once and for all, but apparently, because unRAID runs from a RAM file system, the installation has to be repeated after every reboot.
November 16, 2008 (Author)

The problem showed up again today. With a console attached I could do more debugging, but I could not find anything wrong except in the ifconfig output.

First ifconfig output
---------------------
    eth0      Link encap:Ethernet  HWaddr 00:21:85:1A:CD:0F
              inet addr:192.168.0.100  Bcast:192.168.0.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:4294967253 overruns:0 frame:0
              TX packets:0 errors:0 dropped:212 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
              Interrupt:16 Base address:0xe000

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:102 errors:0 dropped:0 overruns:0 frame:0
              TX packets:102 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:10416 (10.1 KiB)  TX bytes:10416 (10.1 KiB)

ifconfig again (a couple of seconds later)
------------------------------------------
    eth0      Link encap:Ethernet  HWaddr 00:21:85:1A:CD:0F
              inet addr:192.168.0.100  Bcast:192.168.0.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:4294967227 overruns:0 frame:0
              TX packets:0 errors:0 dropped:212 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
              Interrupt:16 Base address:0xe000

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:105 errors:0 dropped:0 overruns:0 frame:0
              TX packets:105 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:10728 (10.4 KiB)  TX bytes:10728 (10.4 KiB)

(1) The RX dropped count is a huge number right after the system comes up.
(2) The RX dropped count is going backward.

It looks to me as if, for some reason, the on-board gigabit Ethernet is not initialized properly. Once I ran "shutdown -r now" to reboot the system, the network was back to normal.

Here is more information (it all looks normal to me).

Route output
------------
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
    loopback        *               255.0.0.0       U     0      0        0 lo
    default         home            0.0.0.0         UG    1      0        0 eth0

ethinfo output
--------------
    driver: r8169
    version: 2.2LK-NAPI
    firmware-version:
    bus-info: 0000:03:00.0

lsmod output
------------
    Module                  Size  Used by
    md_mod                 49992  5
    fuse                   34580  3
    ide_disk               11904  4
    pata_jmicron            3712  0
    r8169                  22148  0
    ahci                   21508  4
    jmicron                 2048  0 [permanent]
    ide_core               88236  2 ide_disk,jmicron
    libata                122552  2 pata_jmicron,ahci
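Both RX dropped readings in the ifconfig output sit just below 2^32 (4294967296), which is what a 32-bit counter displays after it has gone below zero. A minimal sketch of that arithmetic follows; it assumes a shell with 64-bit integer arithmetic (bash, or sh on a 64-bit system), and to_signed32 is a hypothetical helper, not part of ifconfig or unRAID.

```shell
#!/bin/sh
# Hypothetical helper: reinterpret an unsigned 32-bit value as signed.
to_signed32() {
    v=$1
    if [ "$v" -ge 2147483648 ]; then      # 2^31 and above map to negatives
        echo $(( $v - 4294967296 ))       # subtract 2^32
    else
        echo "$v"
    fi
}

to_signed32 4294967253   # first reading  -> -43
to_signed32 4294967227   # second reading -> -69
```

Read as signed values, the counter went from -43 to -69 between the two ifconfig calls, so "going backward" is consistent with a counter that is being decremented past zero rather than with random garbage.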
November 16, 2008

Try replacing your cable with a known good CAT5e cable. Also use a different port on your switch. For debugging, you can also try locking your NIC at 100 Mb/s instead of gigabit.
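Locking the NIC at 100 Mb/s is normally done with ethtool. The sketch below only prints the command unless DRY_RUN=0 is set; lock_nic_100 and DRY_RUN are names made up for this example, eth0 is taken from the thread, and whether the r8169 driver honours a forced speed on this board is an assumption that has to be checked on the server itself.

```shell
#!/bin/sh
# Hypothetical helper: force an interface to 100 Mb/s full duplex.
# By default it only prints the command (DRY_RUN=1) so it is safe to try.
lock_nic_100() {
    iface="${1:-eth0}"
    cmd="ethtool -s $iface speed 100 duplex full autoneg off"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

lock_nic_100 eth0
# To go back to normal negotiation afterwards:
#   ethtool -s eth0 autoneg on
```

The setting does not survive a reboot, so it is only useful as a temporary test of whether gigabit negotiation is involved.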
November 17, 2008 (Author)

I am not sure how a cable can explain the counter going backward. As a matter of fact, I am using a RoHS-compliant 24 AWG Cat6 cable, and I have similar cables on my other machines, which also have gigabit Ethernet (on-board or add-on) but run Windows; all of them work fine. The switch I am using is a D-Link DGS-2208, and I cannot see any issue with it.

My system has been running fine all day today. I just checked again, and the ifconfig output still shows the RX dropped count as a random huge number. It looks to me as if ifconfig is reading the counter from the wrong address or something like that, so those numbers are bogus. Meanwhile, ethtool shows the negotiated speed is 1000 Mb/s.
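One way to check whether ifconfig itself is misreading the counter is to look at the kernel's own table in /proc/net/dev, which is where net-tools ifconfig takes its statistics from. A small sketch follows; rx_tx_drops is a hypothetical helper, and the field positions assume the standard /proc/net/dev layout (receive drop is the 4th number after the interface name, transmit drop the 12th).

```shell
#!/bin/sh
# Hypothetical helper: print the RX and TX drop counters for one interface
# straight from /proc/net/dev (or from a saved copy given as 2nd argument).
rx_tx_drops() {
    iface="$1"
    file="${2:-/proc/net/dev}"
    # Replace the colon after the interface name with a space so that
    # "eth0:123" and "eth0: 123" are split into fields the same way.
    awk -v ifc="$iface" '
        { sub(/:/, " ") }
        $1 == ifc { print "rx_drop=" $5, "tx_drop=" $13 }
    ' "$file"
}

# Usage on the server:
#   rx_tx_drops eth0
```

If the value printed here matches ifconfig, the bogus number comes from the driver or kernel statistics rather than from ifconfig reading the wrong place.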
November 17, 2008

Going through the process of elimination is the right thing to do. Rather than arguing logically why each component should work, try a more pragmatic approach and recognize that something is broken. Go through the easy items one at a time. If your unRAID still doesn't work after eliminating one, move on to the next item on the list. We don't suggest solutions randomly; experience is talking.

Bill
November 17, 2008

Quote: "I am not sure how a cable can explain the counter going backward..."

I agree with you: the dropped-packets number going "backwards" is clearly a bug, an indication of a number that has overflowed its ability to count upwards. Now, it could be that there really have been that many dropped packets (and more), or it could be the Realtek driver.

I looked through this thread and did not find where you describe which version of unRAID you are running. Could you let us know? I ask because the Realtek driver has had issues in the past, and with many releases Tom tries to include the latest drivers. The solution might be as simple as upgrading to the 4.4-beta2 release of unRAID, or dropping back to 4.3-final if you are currently on 4.4-beta2.

Good news: I did a quick search on Google and found that the dropped-packets counter is indeed a bug. Look here: http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-10/msg04048.html

Bad news: they seem to indicate that it is an error in reporting only and does not affect connectivity. Clearly, you have an issue with connectivity.

Joe L.
November 17, 2008 (Author)

My unRAID version is 4.3.3. The motherboard is an MSI P43 Neo3-F with the version 1.3 BIOS.

One thing I forgot to mention: while I was debugging, I also ran "/etc/rc.d/rc.inet1 stop" and "/etc/rc.d/rc.inet1 start" to try to restart the network service, but with no luck. After that I tried to ping 192.168.0.2, which is my ADSL router, and got an "Unreachable" error. It looks to me like this is not a layer 3 problem.

I have just successfully flashed the motherboard to the version 1.5 BIOS from the MSI web site, but the RX counter problem still exists.