wsume99

Members
  • Posts

    531

Everything posted by wsume99

  1. If it'll make you feel better then go ahead and run memtest but I don't think it's really necessary (I assume it was working fine in your other machine) but others here may disagree. Do you have a discrete NIC you could throw in there? Perhaps your onboard NIC is buggy. Otherwise you'd have to swap out the MB to check the NIC.
  2. Certainly the fact that the RAM passed memtest is a good thing, but I do not think it guarantees your RAM is trouble-free. Case in point - I recently received 2 new sticks of RAM. Ironically it was Kingston ValueRAM very similar to what you are using. I ran 4 passes of memtest with no errors. I installed them in my server and everything worked fine for a few days until I started to get random crashes every few days. Thinking that it might be a compatibility issue with my server MB (this specific RAM was not on the QVL) I swapped the RAM between my HTPC & server. I got a BSOD within 30 minutes in my HTPC. So while I waited for the RMA I just split my two good sticks that were originally in the HTPC between the two machines. So the moral of the story is that just because your RAM passes memtest, don't think that it is defect-free. My suggestion would be to try another stick of RAM in your MB - preferably one that has been running in a stable machine for some time. Really this should be your strategy going forward - swap out components one at a time (RAM, PSU, MB/CPU) until you can pinpoint the cause.
  3. I have sabnzbd installed using the unMenu package attached in the OP but I've noticed that it's taking quite a while to repair files. For example I downloaded a 7.2 GB file and it took ~33 minutes to repair the file. Apparently the version of par2cmdline that is included in the package will only use a single core when repairing files. I followed the instructions provided in the post from neilt0 here for upgrading to the multi-core version of par2cmdline. I followed his instructions exactly as they were listed and everything worked perfectly. Now that same 7.2 GB file is repaired in ~17 minutes because it is using both cores in my e5500. It would be nice if the multi-core version of par2cmdline were included in the package but in the meantime there is a valid workaround.
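To put the repair times above in perspective, here is a minimal sketch of the arithmetic. It assumes decimal gigabytes (1 GB = 1000 MB) and treats the ~33 and ~17 minute figures as exact, which they are not, so read the results as ballpark numbers only.

```python
def repair_rate_mb_per_s(size_gb: float, minutes: float) -> float:
    """Average repair throughput in MB/s, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / (minutes * 60)

# Figures quoted in the post: the same 7.2 GB file, before and after the swap.
single_core_rate = repair_rate_mb_per_s(7.2, 33)
multi_core_rate = repair_rate_mb_per_s(7.2, 17)

# Speedup is simply the ratio of the two repair times.
speedup = 33 / 17

print(f"single-core: {single_core_rate:.1f} MB/s")
print(f"multi-core:  {multi_core_rate:.1f} MB/s")
print(f"speedup:     {speedup:.2f}x")
```

A speedup a little under 2x is about what you would hope for when going from one core to two.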
  4. Well I've received 3 2TB HDDs (2-5k3000 & 1-WD20EARS) from Newegg over the last month and they all were packaged the same. Drive in anti-static bag, wrapped in bubble wrap, placed in a small white box that was put inside a larger box with some packing paper. Better than it used to be but still not as good as it could be. If they are now shipping like shown above I would consider that an improvement.
  5. Where did you get your HDDs from in the picture? That looks like some good packaging.
  6. Well I found an Active State Power Management (ASPM) option in my BIOS and set that to DISABLED. That did not work at all. The server booted but the array would not start and eventually my server turned itself off. Time to visit Google. Still trying to figure out how to set the kernel boot parameter pcie_aspm=off, if that's even possible in unRAID.
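One way to at least confirm whether the parameter took effect is to look at the command line the running kernel was booted with, which Linux exposes in /proc/cmdline. The sketch below is only a check, not a way to set the parameter. On syslinux-booted systems, boot parameters are normally added to the `append` line of syslinux.cfg, but whether that applies to a given unRAID flash drive is an assumption here. The sample string `initrd=bzroot pcie_aspm=off` is illustrative, not copied from a real server.

```python
from pathlib import Path
from typing import Optional

def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on the kernel command line."""
    return param in cmdline.split()

def aspm_disabled_on_cmdline(path: str = "/proc/cmdline") -> Optional[bool]:
    """Check the running kernel's command line for pcie_aspm=off.

    Returns None if the file cannot be read (e.g. not running on Linux).
    """
    try:
        return has_kernel_param(Path(path).read_text(), "pcie_aspm=off")
    except OSError:
        return None

# Illustrative command line strings (assumed, not taken from a real server).
print(has_kernel_param("initrd=bzroot pcie_aspm=off", "pcie_aspm=off"))
print(has_kernel_param("initrd=bzroot", "pcie_aspm=off"))
```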
  7. Here is what I've done in the past 3-4 weeks (in order)...
       • Upgraded to 4.7
       • Removed the jumpers from all my EARS drives (see below)
       • Replaced a WD10EARS data drive with a WD20EARS
       • Added a user script to control the speed of my case fan based on HDD temps
       • Installed new RAM (2x2GB)
       • Installed a non-array drive
       • Installed sabnzbd package onto the non-array drive
       • Reverted back to original RAM (but only using a single 2GB module now)
     To remove the jumpers I started by upgrading to 4.7 and then preclearing a new EARS drive unjumpered. I then removed a jumpered EARS drive in my array and replaced it with an unjumpered one. After rebuilding the drive and verifying parity I then removed the jumper from the drive I had just replaced and then precleared it. I repeated this sequence until all my jumpered EARS drives were unjumpered.
     Now I've had this problem I think about 3-4 times in the past month. I believe that this problem (loss of network connection) occurred once before I made any of these changes so I don't believe that these changes have anything to do with my problem - but I suppose I could be wrong. Also about 4-6 weeks ago I began to use my server a lot more for streaming to my HTPC. So I would say that my usage of the server has actually been the biggest change. From what I have read about this issue I think that it is related to the 8111C NIC, so it only makes sense that this problem has surfaced now that I'm using the server a lot more.
  8. Basically I got home from work and my wife just says, "What happened to all the movies? XBMC is saying they are not available." So I checked the unRAID and unMenu webGUI but was unable to connect. I tried to telnet and I could not connect. I tried to ping and got no response. I went downstairs to the server and the NIC was still on (i.e. lights were on and blinking) and the switch was showing the server present as well. I pushed the power button to see if the server would reboot and it did. I now have a monitor and a keyboard hooked up so the next time it happens I'll check the console.
     Here is what I think was happening when the link was lost... I think the server was streaming a movie to my HTPC when the timeout occurred. Again I was not home so I'm not sure. I do know that at 15:56:14 a spindown command was issued to drive 2. My drives are all on 15 min spindown timers so that means that something was accessing the server at ~15:41. The transmit queue problem occurred at 15:44:19. Just to confirm that a movie was playing I asked my wife - "Hey, were you watching a movie on the HTPC this afternoon?" The answer I got was something like this - "I don't know, I don't keep track of what I do with the HTPC." Thanks honey, that helps a lot.
     I did find this post on the forums where a few other users have had this same problem. It does not appear to happen very often. I guess I'm one of the unlucky ones. The suggestion from Rob J. was to get an Intel NIC because it seems this issue is specific to the Realtek 8111 NICs. I see this as my last resort. Now I also found this bug report - Bug 538920 - r8169 netdev timeout when aspm is enabled - which seems to indicate that setting aspm to off may fix the problem. The bug is reported against Fedora but it is the same hardware (Realtek 8111C) and driver (r8169) that I'm using so I think that it is applicable to unRAID. Is this correct?
     Reading through all the posts I see two possible solutions: 1) set the kernel boot parameter pcie_aspm=off or 2) use kernel 2.6.37 or newer. I'm not even sure if either of these two options would apply to unRAID. I don't think that using a newer kernel is possible - doesn't limetech have to do that? But I'm not sure about the aspm parameter. Maybe that is a possibility.
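The timeline reasoning in this post (spindown at 15:56:14, 15 minute timers, watchdog at 15:44:19) can be sketched as a quick calculation. It assumes the spindown command is issued exactly 15 minutes after the last disk access, which may not be precisely how the timer behaves, so the result is an estimate.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S"
SPINDOWN_DELAY = timedelta(minutes=15)  # spindown timer setting from the post

def estimated_last_access(spindown_time: str) -> datetime:
    """Estimate the last disk access, assuming spindown fires exactly on the timer."""
    return datetime.strptime(spindown_time, TIME_FORMAT) - SPINDOWN_DELAY

last_access = estimated_last_access("15:56:14")          # spindown of drive 2
watchdog_time = datetime.strptime("15:44:19", TIME_FORMAT)  # transmit queue timeout

# How long after the last disk access did the NIC watchdog fire?
gap = watchdog_time - last_access

print("estimated last access:", last_access.strftime(TIME_FORMAT))
print("watchdog fired after: ", gap)
```

Under that assumption the last disk access falls about three minutes before the transmit queue timeout, which is consistent with something reading from the server shortly before the link was lost.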
  9. I launched memtest when I got home from work and let it run all night. I stopped it this morning after 20 passes (~13 hours) with 0 errors. So it looks like my RAM is not the problem. I suppose I'm still looking for an explanation as to what these entries from my syslog mean ...
     Apr 5 15:44:19 Tower kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0xff/0x17f()
     Apr 5 15:44:19 Tower kernel: Hardware name: C2SEA
     Apr 5 15:44:19 Tower kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
     Apr 5 15:44:19 Tower kernel: Modules linked in: md_mod xor i2c_i801 i2c_core ahci r8169
     Apr 5 15:44:19 Tower kernel: Pid: 0, comm: swapper Not tainted 2.6.32.9-unRAID #8
     Apr 5 15:44:19 Tower kernel: Call Trace:
     Apr 5 15:44:19 Tower kernel: [<c102449e>] warn_slowpath_common+0x60/0x77
     Apr 5 15:44:19 Tower kernel: [<c10244e9>] warn_slowpath_fmt+0x24/0x27
     Apr 5 15:44:19 Tower kernel: [<c123b505>] dev_watchdog+0xff/0x17f
     Apr 5 15:44:19 Tower kernel: [<c1037139>] ? sched_clock_cpu+0x136/0x14a
     Apr 5 15:44:19 Tower kernel: [<c123b406>] ? dev_watchdog+0x0/0x17f
     Apr 5 15:44:19 Tower kernel: [<c102bb23>] run_timer_softirq+0x105/0x158
     Apr 5 15:44:19 Tower kernel: [<c1028261>] __do_softirq+0x84/0xf8
     Apr 5 15:44:19 Tower kernel: [<c10282fb>] do_softirq+0x26/0x2b
     Apr 5 15:44:19 Tower kernel: [<c1028556>] irq_exit+0x29/0x2b
     Apr 5 15:44:19 Tower kernel: [<c10118f0>] smp_apic_timer_interrupt+0x6f/0x7d
     Apr 5 15:44:19 Tower kernel: [<c10031f6>] apic_timer_interrupt+0x2a/0x30
     Apr 5 15:44:19 Tower kernel: [<c10085f9>] ? mwait_idle+0x4c/0x52
     Apr 5 15:44:19 Tower kernel: [<c12108ad>] cpuidle_idle_call+0x28/0x9b
     Apr 5 15:44:19 Tower kernel: [<c1001a14>] cpu_idle+0x3a/0x4e
     Apr 5 15:44:19 Tower kernel: [<c129c662>] start_secondary+0x195/0x19a
     Apr 5 15:44:19 Tower kernel: ---[ end trace ccea7bb31804fb24 ]---
     Apr 5 15:44:21 Tower kernel: r8169: eth0: link up
     Apr 5 15:50:03 Tower kernel: r8169: eth0: link up
     Apr 5 15:55:27 Tower kernel: r8169: eth0: link up
     Apr 5 15:56:14 Tower kernel: mdcmd (18): spindown 2
     Apr 5 17:19:40 Tower kernel: r8169: eth0: link up
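As a rough guide to reading entries like these: the NETDEV WATCHDOG line is the kernel reporting that a transmit queue on the named interface stopped making progress, and the "link up" lines that follow suggest the driver brought the link back up afterwards. The sketch below pulls those fields out of syslog text. The regular expressions are written against the lines quoted in this post only, so other syslog formats are an untested assumption.

```python
import re

# Matches e.g. "Apr 5 15:44:19 Tower kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)
# Matches e.g. "... kernel: r8169: eth0: link up"
LINK_UP_RE = re.compile(r"kernel: (?P<driver>\S+): (?P<iface>\S+): link up$")

def summarize(lines):
    """Return (list of watchdog timeouts, number of 'link up' events after the first one)."""
    timeouts = []
    link_ups_after_timeout = 0
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            timeouts.append(match.groupdict())
        elif timeouts and LINK_UP_RE.search(line):
            link_ups_after_timeout += 1
    return timeouts, link_ups_after_timeout

# A few lines copied from the syslog excerpt in this post.
sample = [
    "Apr 5 15:44:19 Tower kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out",
    "Apr 5 15:44:21 Tower kernel: r8169: eth0: link up",
    "Apr 5 15:50:03 Tower kernel: r8169: eth0: link up",
    "Apr 5 15:56:14 Tower kernel: mdcmd (18): spindown 2",
]
timeouts, link_ups = summarize(sample)
print(timeouts)
print("link up events after timeout:", link_ups)
```

To run it against a saved log, pass the file's lines instead of `sample`, for example `summarize(open(path))` with `path` pointing at a copy of the syslog.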
  10. The SMART report looks good to me. I see nothing in there at all that would indicate there is anything wrong with that drive.
  11. @dgaschk - I didn't see your reply until this morning. I checked last night and this morning using the "Show new replies to your posts" link at the top of the forum and there were no replies to this message. However when I checked the actual post there was a reply. I'll launch the memtest when I get home this evening. I'm curious though, did you see something in my syslog that would indicate a problem with my RAM or is this just a standard troubleshooting step that will help isolate the issue?
  12. Yep, the syslog from yesterday when I shut down was in the /boot/logs folder on my flash drive. I've attached a copy of the entire syslog. Here is the part from my syslog that concerns me. These were the last entries before I initiated the powerdown. It looks like something went wrong with the NIC but I don't really know what to make of this.
      Apr 5 15:44:19 Tower kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0xff/0x17f()
      Apr 5 15:44:19 Tower kernel: Hardware name: C2SEA
      Apr 5 15:44:19 Tower kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
      Apr 5 15:44:19 Tower kernel: Modules linked in: md_mod xor i2c_i801 i2c_core ahci r8169
      Apr 5 15:44:19 Tower kernel: Pid: 0, comm: swapper Not tainted 2.6.32.9-unRAID #8
      Apr 5 15:44:19 Tower kernel: Call Trace:
      Apr 5 15:44:19 Tower kernel: [<c102449e>] warn_slowpath_common+0x60/0x77
      Apr 5 15:44:19 Tower kernel: [<c10244e9>] warn_slowpath_fmt+0x24/0x27
      Apr 5 15:44:19 Tower kernel: [<c123b505>] dev_watchdog+0xff/0x17f
      Apr 5 15:44:19 Tower kernel: [<c1037139>] ? sched_clock_cpu+0x136/0x14a
      Apr 5 15:44:19 Tower kernel: [<c123b406>] ? dev_watchdog+0x0/0x17f
      Apr 5 15:44:19 Tower kernel: [<c102bb23>] run_timer_softirq+0x105/0x158
      Apr 5 15:44:19 Tower kernel: [<c1028261>] __do_softirq+0x84/0xf8
      Apr 5 15:44:19 Tower kernel: [<c10282fb>] do_softirq+0x26/0x2b
      Apr 5 15:44:19 Tower kernel: [<c1028556>] irq_exit+0x29/0x2b
      Apr 5 15:44:19 Tower kernel: [<c10118f0>] smp_apic_timer_interrupt+0x6f/0x7d
      Apr 5 15:44:19 Tower kernel: [<c10031f6>] apic_timer_interrupt+0x2a/0x30
      Apr 5 15:44:19 Tower kernel: [<c10085f9>] ? mwait_idle+0x4c/0x52
      Apr 5 15:44:19 Tower kernel: [<c12108ad>] cpuidle_idle_call+0x28/0x9b
      Apr 5 15:44:19 Tower kernel: [<c1001a14>] cpu_idle+0x3a/0x4e
      Apr 5 15:44:19 Tower kernel: [<c129c662>] start_secondary+0x195/0x19a
      Apr 5 15:44:19 Tower kernel: ---[ end trace ccea7bb31804fb24 ]---
      Apr 5 15:44:21 Tower kernel: r8169: eth0: link up
      Apr 5 15:50:03 Tower kernel: r8169: eth0: link up
      Apr 5 15:55:27 Tower kernel: r8169: eth0: link up
      Apr 5 15:56:14 Tower kernel: mdcmd (18): spindown 2
      Apr 5 17:19:40 Tower kernel: r8169: eth0: link up
      syslog.zip
  13. Thanks for the help guys but I think after many hours of searching that I finally have my answer. According to this post the unMenu powerdown script already saves the syslog for me. I believe that it is supposed to copy it to the flash drive in /boot/logs/. So I should have the info I need in that file because my server was not responding on the network last night and I had to reboot it. I feel kind of stupid that it was there all the time and I didn't even know it.
  14. Thanks for the reply. I've seen that mentioned before but I was concerned that since my problem is a loss of network comm, the important info might not be written to the syslog until after the connection was lost, so it wouldn't show up in the telnet session. I suppose it's worth a try. Otherwise I'll have to hook up a monitor and keyboard until I can get this figured out.
  15. Recently my server has started to randomly stop responding on the network. It won't respond to \\Tower\*, directly typing the IP address into the browser, telnet sessions, or even a ping. It is however still running and I am able to power it down by pressing the power button on the case. When it reboots everything is just fine except I've lost the syslog. Is there a way to capture the contents of the syslog as part of the shutdown process so I can see what's happening? I'm trying to avoid having to hook up a monitor and keyboard. I've attached a copy of my syslog after rebooting just FYI. syslog-2011-04-05.txt
  16. ^^You might be right. I usually don't rely on the mfg specs; instead I find a good online review (e.g. silentpcreview, anandtech) where they tested the PSU under different loads and measured its efficiency.
  17. Sorry but that's not quite right. An 80 Plus PSU must have at least 80% efficiency at 20%, 50%, and 100% load. Many 80 Plus PSUs provide less than 80% efficiency at less than 20% load. But your point is valid in that many users who select a large capacity PSU (i.e. >550-600w) to power a small server (<5-6 disks) are going to suffer from poor PSU efficiency when their server is idle/spun down.
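To make the point about oversized PSUs concrete, here is a small sketch that works out what fraction of a PSU's rating a given draw represents and whether it falls below the 20% load point. The 42 W idle figure comes from elsewhere in these posts; the 600 W and 200 W ratings are hypothetical examples. Wall-socket readings also include PSU losses, so treating them as PSU load slightly overstates the load.

```python
LOWEST_RATED_LOAD = 0.20  # lowest load point named in the post (20%)

def load_fraction(draw_watts: float, psu_rated_watts: float) -> float:
    """Fraction of the PSU's rated capacity in use."""
    return draw_watts / psu_rated_watts

def below_rated_range(draw_watts: float, psu_rated_watts: float) -> bool:
    """True if the draw is under the lowest load point covered by the 80% requirement."""
    return load_fraction(draw_watts, psu_rated_watts) < LOWEST_RATED_LOAD

IDLE_DRAW_WATTS = 42  # idle draw mentioned in these posts

# Hypothetical PSU ratings for comparison.
for rated in (600, 200):
    fraction = load_fraction(IDLE_DRAW_WATTS, rated)
    print(f"{rated} W PSU: {fraction:.0%} load, "
          f"below 20% point: {below_rated_range(IDLE_DRAW_WATTS, rated)}")
```

With a 600 W unit, a 42 W idle load sits at 7% of the rating, well under the range where the 80% figure applies. A 200 W unit would put the same load just above the 20% point.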
  18. Your E5500 has a TDP of 65watt ,, why don't replace it with a Celeron 440 with a 35 watt TDP, you only use 3 drives. and if the Asus E35M1-PRO is already ordered? you can sell your C2SEE to me,, so i have a spare :-)
      @Rembro - Actually my friend and I both got the $20 C2SEE deal. He went with the 430 and I got the E5500. We both have 3 green drives. My rig idles at 42w and his idles at 40w but I've got a lot more processing power when I need it. What I have learned over time is that TDP is not a great indicator of idle power use. TDP is a better predictor of load power use but it is not really an absolute but rather a general measure for comparison - especially when you're comparing CPUs that are similar. By that I mean a CPU with a 35w TDP won't necessarily use 35w at full load (it really means that it's rated to use that much) but instead that in general a 35w CPU will use less power than a 65w CPU at full load.
      Sorry, but the E35M1-PRO is going in my HTPC to replace my sempron. Actually I got it yesterday and swapped it out last night. The C2SEE is staying put in my server. I knew I should have bought more than one when they were on sale for $20.
  19. You can do all of this via Disk Management page in unMenu if you have that installed. Any drive not in the array will show up at the bottom of that page. Use the buttons to do the following...
        1. Create a reiserfs filesystem on the drive
        2. Mount the drive
        3. Make the drive writeable (optional)
        4. Share the drive
  20. Just to echo Raj's point... Assuming you are running your server 24/7, every 1 W your server draws equates to ~8.76 kWh per year (24 x 365 = 8,760 hours). So if you live in the US where average electricity cost is ~$0.11/kWh we're talking about ~$1 per year per watt.
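The rule of thumb above is easy to check with a few lines. This is a minimal sketch that assumes a constant draw all year and the ~$0.11/kWh price quoted in the post; plug in your own rate.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

def annual_kwh(watts: float) -> float:
    """Energy used in a year by a constant load of `watts`."""
    return watts * HOURS_PER_YEAR / 1000

def annual_cost(watts: float, price_per_kwh: float = 0.11) -> float:
    """Yearly electricity cost in dollars for a constant load."""
    return annual_kwh(watts) * price_per_kwh

print(f"1 W:  {annual_kwh(1):.2f} kWh/yr, ${annual_cost(1):.2f}/yr")
print(f"42 W: {annual_kwh(42):.2f} kWh/yr, ${annual_cost(42):.2f}/yr")
```

So a server idling at 42 W around the clock costs roughly $40 a year at that rate, and every watt saved is worth about a dollar a year.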
  21. I used to have a sempron 140 in my server with three 2TB WD Green drives and IIRC my server used ~70w with all drives spun down. I was actually quite disappointed in those numbers and I have read on the forums that the AMD cool 'n quiet drivers are not part of the unraid kernel. I believe there is a way to fix this but I'm still a linux noob and 5 months ago (when I was working this issue) I wasn't even qualified to call myself a noob so I jumped on the $20 Supermicro C2SEE deal and got an intel e5500. Now my server idles at 42w with the same 3 drives. I ended up using the sempron in a new HTPC, but it is now getting replaced by an e350 (ASUS E35M1-M PRO) because I still am not happy with its power use even when cool 'n quiet is working (~55w @ idle). My advice - get the e350 over the atom or a LGA775 over a sempron.
  22. ^^ That worked like a charm. Thanks a lot Joe L. I was using these commands (so close but yet so far) ...
        mkdir /mnt/disk/sdd1
        mount /dev/sdd1 /mnt/disk/sdd1
      But it wasn't working for me. I knew I was missing something. Just wondering, do I need to manually unmount the drive before rebooting my server or will clean powerdown take care of it for me since I have that installed?
  23. ^^That's not odd at all, that's just being wise. I suppose that would be rule #2 for sizing a cache drive - make it as large as your largest drive so that it can serve as a warm spare. That way if one of your array drives fails you can add it to the array a lot quicker than a brand new drive.
  24. I'm trying to get sabnzbd running but I need some help. I'm a little bit of a Linux noob but I'm slowly getting better. I have a non-array drive (sdd) installed in my server. After preclearing the drive I used the Disk Management tools in unMenu to put a reiserfs filesystem on sdd1. I then mounted the drive, made it writeable, and shared the drive. Then I installed sabnzbd onto /mnt/disk/sdd1/sabnzbd. The package worked great. I configured sabnzbd, grabbed an nzb and successfully downloaded a file. I then stopped sabnzbd and unmounted the drive using unMenu's Disk Management tools and then rebooted. After rebooting, /dev/sdd1 no longer appears on the Disk Management page. All I see now is /dev/sdd. I was working under the assumption that after the reboot I would just need to remount the drive, make it writeable, and then share it. Perhaps I need to do all this on sdd instead of sdd1. Can someone please point me in the right direction? I really need more detail on how to initially mount my non-array drive and then make it mount automatically after rebooting (I assume I'll be editing my go file for that).