twg

Members
  • Posts

    126

Everything posted by twg

  1. wow... I guess it's my old WD drives that are making my parity checks 15+ hours long! It also looks like disk 5 might be failing? I don't see any errors or sector reallocations, but it's quite a bit slower than the other two 1.5TB drives... Also, quite surprisingly, the Hitachi 2TB drives are slower than the Seagate 2TB drives. I think the latter are newer, so maybe they use higher-density platters... think it's time to replace my 1.5TB drives... diskspeed.zip
  2. With the latest version, 2.1, I found that it wouldn't find my USB key properly. I had to make the following change for it to work:

     OLD: line 156: tmp2=$((${#tmp1} - 4))
     NEW: line 156: tmp2=$((${#tmp1} - 3))

     Not sure if this is just a hack for something else that is causing my problem, but my system identified my USB key as:

     root@Tower:/boot/scripts# ls -l /dev/disk/by-label > /tmp/flashid
     root@Tower:/boot/scripts# cat /tmp/flashid
     total 0
     lrwxrwxrwx 1 root root 9 2014-01-13 22:31 UNRAID -> ../../sda

     With the original offset, $FlashID ended up being "/sd", so the script tried to test the speed of my USB key, which resulted in the end-of-disk error...
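A fixed character offset like the one on line 156 depends on how long the symlink target happens to be. As a rough sketch of an alternative (not the script author's method), the device name can be read from the by-label symlink directly. The function name `flash_device` is hypothetical, and the default label `UNRAID` is taken from the listing above:

```shell
# Hypothetical helper: print the kernel device name (e.g. "sda") that a
# /dev/disk/by-label symlink points at, without counting characters.
flash_device() {
    local link="${1:-/dev/disk/by-label/UNRAID}"
    local target
    # readlink prints the link target as stored, e.g. "../../sda"
    target=$(readlink "$link") || return 1
    # basename strips the leading "../../" path components
    basename "$target"
}
```

If the label sits on a partition rather than the whole disk, this would print the partition name (for example `sda1`), so a caller would still need to decide which of the two it wants.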
  3. Thanks! Tom just responded to me now. I have been battling a cold/flu for the last 4 weeks... rest up and drink lots of water.
  4. Hi everyone. It's been 4 days since I emailed Tom about my corrupt flash drive, but I've had no response. Has anyone else heard from him lately? Not sure if he is on vacation or just getting swamped with work. My unRaid server is somewhat unreliable right now because of the flash drive, and I'd like to replace it ASAP.
  5. couple of issues... the USB key was going into read-only mode, and the secrets.tdb file was also corrupt, so I had to recreate it via Samba (do a search here, there are other threads on this). Looks like my USB key is probably failing, so I will pick up a new key soon. Problem fixed.
  6. I'm a Linux noob so not really sure what the errors mean... I know that Samba doesn't run when I power on my server. The errors in the syslog appear to be related to a corrupt file system on the USB key and something about the super.dat file... which I know can be serious. I haven't tried anything else for fear of making things worse. Suggestions? syslog-2013-12-26.txt
  7. OK, this is weird. After about a month of random crashes caused by OOM, and many hours of troubleshooting, I've finally narrowed the OOM down to my rsync command... and it appears that my mount from above wasn't working... AGAIN. If I type the mount command on the command line, it mounts fine, but when it's in my go script it doesn't mount, and there are no errors in the syslog either... what gives?? I was using:

     mount.cifs //192.168.0.105/Volume_1 /mnt/dns323 -o guest

     I also tried:

     mount -t cifs //192.168.0.105/Volume_1 /mnt/dns323 -o guest

     Both work on the command line but not in my go script... pulling my hair out here...
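One possible explanation, which is an assumption and not something the post confirms, is that the go script runs before the network or the NAS is reachable, so the mount fails once at boot and is never retried. A minimal sketch of a retry wrapper follows; the name `retry` and the attempt and delay values in the usage line are hypothetical:

```shell
# Hypothetical helper: run a command up to N times, sleeping between
# failed attempts. Usage: retry ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0            # command succeeded, stop retrying
        fi
        i=$((i + 1))
        # only sleep if another attempt is still to come
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1                    # every attempt failed
}

# Possible use in the go script (left commented out; values are guesses):
# mkdir -p /mnt/dns323
# retry 10 5 mount -t cifs //192.168.0.105/Volume_1 /mnt/dns323 -o guest
```

If the mount still fails after several retries, redirecting the command's output to a file on the flash drive would at least capture the error that the syslog does not show.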
  8. anyone figure out how to reduce these log entries ? I have a bunch of these too...
  9. actually I ran that a couple of hours ago, after the server had been up for at least a day and a half, and after I ran a parity check, and re-enabled my swap file... I'll keep monitoring it, now that I know the commands... I'll check it probably later this week when I have more time and re-enable 3.6.8 and see what happens... appreciate your help!
  10. running "du / -ahxd1" gives me an invalid 'd' and invalid '1' option. Running without the d1 gives me 223M, which isn't all that much more than stock unRaid... certainly nothing to overly consume my memory I don't think... running the grep command gives me: unevictable: 229920 kB which matches the above more or less... this is after I removed the 3.6.8 samba that I was manually including... I think that was the problem... to confirm, I'll re-enable the 3.6.8 in my go file and see if my problems come back...
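The "invalid option" complaint suggests that the `du` on this system is an older GNU build that does not yet accept the short `-d` flag. GNU `du` also accepts the long form `--max-depth=N`, so a sketch of the same one-level summary could look like this (the function name `top_level_usage` is hypothetical):

```shell
# Hypothetical helper: show disk usage one level deep, staying on one
# file system (-x), listing files as well as directories (-a), with
# human-readable sizes (-h). Defaults to the root directory.
top_level_usage() {
    du -ahx --max-depth=1 "${1:-/}"
}
```

On unRaid the root file system lives in RAM, which is why its size matters when chasing memory usage.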
  11. while digging through my "go" script, I realized that I was manually installing Samba 3.6.8... when I had rc8 it came with 3.6.7 so I must have had a reason to use a newer version. Now that 5.0 Final is installed, it comes with Samba 3.6.10 so I was actually forcing the use of an older version of Samba... I've removed this, doing a parity check now... see if that solves the problem...
  12. thanks, your suggestion helped... for some odd reason, I had to edit my DNS-323's smb.conf file, changing SECURITY=SHARE to SECURITY=USER, and then with your suggested syntax it worked again... strange that this worked for over a year and now I had to make these changes... might be related to the newer Samba that came with going from rc8 to 5.0 final.
  13. I have a D-Link DNS-323 NAS that I use as my second backup target. I used to mount it on unRaid as follows in my go script:

     mkdir -p /mnt/dns323
     mount //dns323/Volume_1 /mnt/dns323 -o passwords=

     The DNS-323 was set up with no passwords, so anyone can connect. This worked for the longest time, but I just realized that it recently stopped working. When I type the commands manually I get:

     root@Tower:/# mkdir -p /mnt/dns323
     root@Tower:/# mount //dns323/Volume_1 /mnt/dns323 -o passwords=
     mount error(13): Permission denied
     Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

     When I omit the "-o passwords=" option, it asks for a password for the mount but always kicks back the same error as above. What gives?
  14. now I'm not so sure what's going on... last night I wasn't doing anything and the swapfile was off, and this morning unRaid reported running out of memory, so OOM killed a whole bunch of stuff. Samba was down, but luckily the server was still running, so I captured the following logs. What do you think? Results of free -lm:

                  total   used   free  shared  buffers  cached
     Mem:          4017   3993     23       0       64    3733
     Low:           846    835     11
     High:         3170   3157     12
     -/+ buffers/cache:    195   3821
     Swap:            0      0      0

     This is really starting to get annoying. I love unRaid when it works, but it's these boundary cases of instability that I just don't have time to deal with. syslog.txt
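The free output above can look alarming because "used" includes the page cache. A minimal sketch of the arithmetic behind the "-/+ buffers/cache" row, assuming the older procps column layout shown in the post (total, used, free, shared, buffers, cached); the function name `used_without_cache` is hypothetical:

```shell
# Hypothetical helper: read `free -m` style output on stdin and print
# used memory minus buffers and cache, taken from the "Mem:" row.
# Assumes columns: total used free shared buffers cached.
used_without_cache() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Example with the Mem: row from the post: 3993 - 64 - 3733
printf 'Mem: 4017 3993 23 0 64 3733\n' | used_without_cache
```

For the row in the post this prints 196, within rounding of the 195 that free itself reports. That means applications hold only about 200 MB, so the number that stands out is the "Low:" row, with 11 MB free out of 846 MB.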
  15. another corner case to consider, which was affecting me... I have jumbo frames set up, but somehow the NIC drivers on one of my computers got updated during a Windows update. That reset all the jumbo frame settings, and I suddenly experienced video stutter when streaming from unRaid. File copy transfers were still very fast... I eventually figured this out and set the proper jumbo frame values in the NIC driver, and 1080p HD video streaming is back to normal now...
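A quick way to check whether jumbo frames survive end to end is to ping with fragmentation prohibited and a payload sized to the MTU. The payload is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). A small sketch, where the function name is hypothetical and the 9000-byte MTU in the usage line is an assumption, since the post does not state the value used:

```shell
# Hypothetical helper: largest ICMP payload that fits in one IPv4
# packet for a given MTU (20-byte IP header + 8-byte ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Possible check from the unRaid console (Linux iputils ping; -M do
# forbids fragmentation). Host and MTU are placeholders:
# ping -M do -s "$(icmp_payload_for_mtu 9000)" -c 3 <client-ip>
```

If the ping fails at the jumbo size but works at a payload of 1472 (the value for a standard 1500-byte MTU), some device on the path is no longer configured for jumbo frames.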
  16. so I installed the swapfile plugin for 5.0 and enabled it. And this morning, lo and behold, the server was exhibiting the same problems. Attached is the syslog, and below is the output of free -lm:

                  total   used   free  shared  buffers  cached
     Mem:          4017   3994     22       0       76    3709
     Low:           846    835     11
     High:         3170   3159     11
     -/+ buffers/cache:    207   3809
     Swap:         4095      8   4087

     syslog.txt
  17. do you have a swap file ? try disabling the swap file and see if that helps ?
  18. I pretty much had no add-ons except for iStat, which I installed as a debugging tool. I disabled the swap space and tried running a parity check, and I didn't encounter the problem. I'll re-enable the swap space and run a parity check to confirm it was related to the swap space.
  19. console was still responsive so I copied the syslog and did a free -lm and copied it to flash. The syslog is pretty empty... but i noticed on the console screen that many processes were killed... probably due to OOM... syslog-20130918-045006.txt mem.txt
  20. I'll try to get a log, but it's hard because when it happens the telnet server gets killed and Samba gets killed too... the last few snippets of syslog that I was able to get showed I got a parity check status mail at 4:47am (I started the check around 11)... then it started killing processes...
  21. So I was running rc8a for the longest time with no issues. I didn't really have many packages/plugins: mySql, uuMenu, minidlna and that's pretty much it (cache_dirs too). I just upgraded to 5.0 Final and everything seemed to be OK. Then I realized that whenever the server did a parity check, I'd run out of memory. I disabled all my plugins/packages and it's still doing it. I can see that the OOM killer is being invoked, which means memory is running out, and it starts to kill Samba, telnet and sometimes uuMenu as well... Attached are the last few snippets of syslog from when it happens, nothing really obvious:

     Sep 18 04:47:01 Tower crond[1206]: ignoring /var/spool/cron/crontabs/root- (non-existent user)
     Sep 18 04:47:19 Tower sSMTP[9024]: Creating SSL connection to host
     Sep 18 04:47:19 Tower sSMTP[9024]: SSL connection using RC4-SHA
     Sep 18 04:47:23 Tower sSMTP[9024]: Sent mail for root@[email protected] (221 2.0.0 closing connection y9sm3799500qaj.9 - gsmtp) uid=0 username=root outbytes=1870
     Sep 18 04:50:05 Tower kernel: shfs invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
     Sep 18 04:50:06 Tower kernel: Pid: 14331, comm: shfs Tainted: G O 3.9.6p-unRAID #23

     I do still have a swap file set up, and I have now installed the iStat package so I'm able to monitor resources remotely from my iPad. I see a huge number of page-in requests for the swap file... which is probably what keeps the system up, i.e. the parity check is still running but everything else is pretty much stopped. Anyone have any thoughts? I never had this issue previously with 5.0 rc8a...
  22. Long story short, one of my drives died (refused to spin up) so I moved my cache drive over and restarted. I had a new 4TB drive sitting around so I then upgraded from 4.7 to 5.0-rc8a, replaced my 2TB parity drive with the 4TB drive. Everything working fine. After examining the syslog tonight, everything looks good except the following:

     Jan 5 21:04:16 Tower kernel: sas: ata9: end_device-7:0: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata10: end_device-7:1: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata11: end_device-7:2: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata12: end_device-7:3: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata13: end_device-7:4: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: ata13.00: ATA-8: WDC WD15EADS-00P8B0, 01.00A01, max UDMA/133 (Drive related)
     Jan 5 21:04:16 Tower kernel: ata13.00: 2930277168 sectors, multi 0: LBA48 NCQ (depth 31/32) (Drive related)
     Jan 5 21:04:16 Tower kernel: ata13.00: configured for UDMA/133 (Drive related)
     Jan 5 21:04:16 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:3:0: [sdj] Write Protect is off (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:3:0: [sdj] Mode Sense: 00 3a 00 00 (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:2:0: [sdi] Attached SCSI disk (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:3:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA (Drive related)
     Jan 5 21:04:16 Tower kernel: sdj: sdj1 (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:3:0: [sdj] Attached SCSI disk (Drive related)
     Jan 5 21:04:16 Tower kernel: scsi 7:0:4:0: Direct-Access ATA WDC WD15EADS-00P 01.0 PQ: 0 ANSI: 5 (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:4:0: [sdk] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB) (Drive related)
     Jan 5 21:04:16 Tower kernel: sd 7:0:4:0: Attached scsi generic sg10 type 0 (Drive related)
     Jan 5 21:04:16 Tower kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0 (Drive related)
     Jan 5 21:04:16 Tower kernel: sas: ata9: end_device-7:0: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata10: end_device-7:1: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata11: end_device-7:2: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata12: end_device-7:3: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata13: end_device-7:4: dev error handler (Errors)
     Jan 5 21:04:16 Tower kernel: sas: ata14: end_device-7:5: dev error handler (Errors)

     anyone have an idea what these errors are ? syslog.zip
  23. i'm intrigued... how does one benefit from NIC teaming ? I see you can almost double your throughput, but do I need anything special in the rest of my network to benefit from the increased throughput ?
  24. well for whatever reason, miniDLNA's scanner chokes on about 9 movies that I have out of over a thousand, I removed those movies and the scanner was able to finish building the database. Everything works now. In the process I noticed there's a new version 1.0.25, I tried using it, and got it half working, not sure why it didn't fully work though... but i guess I'll leave it at 1.0.24 since it works.
  25. thanks for the rebuild... but I notice that minidlna now chokes on certain files when building the DB... esp. long filenames... removing the offending file helps but i have a lot of files that it chokes on...